From lidia.varsh at mail.ioffe.ru Wed Jun 1 12:37:57 2022
From: lidia.varsh at mail.ioffe.ru (Lidia)
Date: Wed, 1 Jun 2022 20:37:57 +0300
Subject: [petsc-users] Sparse linear system solving
In-Reply-To: 
References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru>
Message-ID: <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru>

Dear Matt,

Thank you for the rule of 10,000 variables per process!

We have run ex5 with a 1e4 x 1e4 matrix on our cluster and obtained good performance scaling (see the figure "performance.png": solve time in seconds versus the number of cores). We used the GAMG preconditioner (multithreaded: we added the option "-pc_gamg_use_parallel_coarse_grid_solver") and the GMRES solver, and we assigned one OpenMP thread to every MPI process. ex5 now works well on many MPI processes, but the run uses about 100 GB of RAM.

How can we run ex5 with many OpenMP threads and no MPI? If we just change the run command, the cores are not loaded properly: usually only one core runs at 100% while the others are idle; sometimes all cores run at 100% for about a second and then go idle again for about 30 seconds. Can the preconditioner use many threads, and how do we enable that? The solve time (the time spent in the solver) is now 511 seconds with 60 OpenMP threads, versus 13.19 seconds with 60 MPI processes. The ksp_monitor outputs for both cases (many OpenMP threads and many MPI processes) are attached.

Thank you!

Best,
Lidia

On 31.05.2022 15:21, Matthew Knepley wrote:
> I have looked at the local logs. First, you have run problems of size 12 and 24. As a rule of thumb, you need 10,000 variables per process in order to see good speedup.
>
>   Thanks,
>
>      Matt
>
> On Tue, May 31, 2022 at 8:19 AM Matthew Knepley wrote:
>
> On Tue, May 31, 2022 at 7:39 AM Lidia wrote:
>
> Matt, Mark, thank you very much for your answers!
>
> Now we have run example 5 on our computer cluster and on the local server and still have not seen any performance increase, but for some unclear reason the running times on the local server are much better than on the cluster.
>
> I suspect that you are trying to get speedup without increasing the memory bandwidth:
>
> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup
>
>   Thanks,
>
>      Matt
>
> Now we will try to run the PETSc example 5 inside a Docker container on our server and see whether the problem is in our environment. I'll send you the results of this test as soon as we get them.
>
> The ksp_monitor outputs for the 5th test with the current local server configuration (for 2 and 4 MPI processes) and for the cluster (for 1 and 3 MPI processes) are attached.
>
> And one more question. Potentially we can use 10 nodes with 96 threads each on our cluster. Which combination of MPI processes and OpenMP threads do you think would be best for example 5?
>
> Thank you!
>
> Best,
> Lidiia
>
> On 31.05.2022 05:42, Mark Adams wrote:
>> And if you see "NO" change in performance, I suspect the solver/matrix is all on one processor.
>> (PETSc does not use threads by default, so threads should not change anything.)
>>
>> As Matt said, it is best to start with a PETSc example that does something like what you want (a parallel linear solve, see src/ksp/ksp/tutorials for examples), and then add your code to it.
>> That way you get the basic infrastructure in place for you, which is pretty obscure to the uninitiated.
>>
>> Mark
>>
>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley wrote:
>>
>> On Mon, May 30, 2022 at 10:12 PM Lidia wrote:
>>
>> Dear colleagues,
>>
>> Is there anyone here who has solved big sparse linear matrices using PETSc?
>>
>> There are lots of publications with this kind of data. Here is one recent one: https://arxiv.org/abs/2204.01722
>>
>> We have found NO performance improvement while using more and more MPI processes (1-2-3) and OpenMP threads (from 1 to 72 threads). Has anyone faced this problem? Does anyone know any possible reasons for such behaviour?
>>
>> Solver behavior is dependent on the input matrix. The only general-purpose solvers are direct, but they do not scale linearly and have high memory requirements.
>>
>> Thus, in order to make progress you will have to be specific about your matrices.
>>
>> We use the AMG preconditioner and the GMRES solver from the KSP package, as our matrix is large (from 100 000 to 1e+6 rows and columns), sparse, non-symmetric, and includes both positive and negative values. But the performance problems also exist when using CG solvers with symmetric matrices.
>>
>> There are many PETSc examples, such as example 5 for the Laplacian, that exhibit good scaling with both AMG and GMG.
>>
>> Could anyone help us to set appropriate options for the preconditioner and solver? Now we use default parameters; maybe they are not the best, but we do not know a good combination. Or maybe you could suggest other preconditioner+solver pairs for such tasks?
>>
>> I can provide more information: the matrices that we solve, the C++ code that runs the solve using PETSc, and any statistics obtained from our runs.
>>
>> First, please provide a description of the linear system, and the output of
>>
>>   -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
>>
>> for each test case.
>>
>>   Thanks,
>>
>>      Matt
>>
>> Thank you in advance!
>>
>> Best regards,
>> Lidiia Varshavchik,
>> Ioffe Institute, St. Petersburg, Russia
>>
>> --
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: performance.png Type: image/png Size: 16053 bytes Desc: not available URL: -------------- next part -------------- [lida at head1 tutorials]$ ./ex5 -m 10000 -ksp_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view -pc_type gamg -ksp_type gmres -pc_gamg_use_parallel_coarse_grid_solver -------------------------------------------------------------------------- WARNING: No preset parameters were found for the device that Open MPI detected: Local host: head1 Device name: i40iw0 Device vendor ID: 0x8086 Device vendor part ID: 14290 Default device parameters will be used, which may result in lower performance. You can edit any of the files specified by the btl_openib_device_param_files MCA parameter to set values for your device. NOTE: You can turn off this warning by setting the MCA parameter btl_openib_warn_no_device_params_found to 0. -------------------------------------------------------------------------- -------------------------------------------------------------------------- No OpenFabrics connection schemes reported that they were able to be used on a specific port. As such, the openib BTL (OpenFabrics support) will be disabled for this port. Local host: head1 Local device: i40iw0 Local port: 1 CPCs attempted: rdmacm, udcm -------------------------------------------------------------------------- [head1.hpc:274354] 1 more process has sent help message help-mpi-btl-openib.txt / no device params found [head1.hpc:274354] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages [head1.hpc:274354] 1 more process has sent help message help-mpi-btl-openib-cpc-base.txt / no cpcs for port 0 KSP Residual norm 2.037538277184e+11 0 KSP preconditioned resid norm 2.037538277184e+11 true resid norm 1.291188079508e+10 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 4.559847344082e+10 1 KSP preconditioned resid norm 4.559847344082e+10 true resid norm 1.145337105566e+10 ||r(i)||/||b|| 8.870412635802e-01 2 KSP Residual norm 1.458580410483e+10 2 KSP preconditioned resid norm 1.458580410483e+10 true resid norm 6.820359295573e+09 ||r(i)||/||b|| 5.282235333346e-01 3 KSP Residual norm 5.133668905377e+09 3 KSP preconditioned resid norm 5.133668905377e+09 true resid norm 3.443273018496e+09 ||r(i)||/||b|| 2.666747837238e-01 4 KSP Residual norm 1.822791754681e+09 4 KSP preconditioned resid norm 1.822791754681e+09 true resid norm 1.429794150530e+09 ||r(i)||/||b|| 1.107347700325e-01 5 KSP Residual norm 6.883552291389e+08 5 KSP preconditioned resid norm 6.883552291389e+08 true resid norm 5.284618300965e+08 ||r(i)||/||b|| 4.092833867378e-02 6 KSP Residual norm 2.738661252083e+08 6 KSP preconditioned resid norm 2.738661252083e+08 true resid norm 2.298184687591e+08 ||r(i)||/||b|| 1.779899244785e-02 7 KSP Residual norm 1.175295112233e+08 7 KSP preconditioned resid norm 1.175295112233e+08 true resid norm 9.785469137958e+07 ||r(i)||/||b|| 7.578655110947e-03 8 KSP Residual norm 4.823372166305e+07 8 KSP preconditioned resid norm 4.823372166305e+07 true resid norm 4.288291058318e+07 ||r(i)||/||b|| 3.321197838159e-03 9 KSP Residual norm 2.019815757215e+07 9 KSP preconditioned resid norm 2.019815757215e+07 true resid norm 1.776678838786e+07 ||r(i)||/||b|| 1.376003129973e-03 10 KSP Residual norm 8.776441360510e+06 10 KSP preconditioned resid norm 8.776441360510e+06 true resid norm 7.333797620917e+06 ||r(i)||/||b|| 5.679883308489e-04 11 KSP Residual norm 3.536170852140e+06 11 KSP preconditioned resid norm 3.536170852140e+06 true resid norm 3.517014965376e+06 ||r(i)||/||b|| 
2.723859537734e-04 12 KSP Residual norm 1.369320429479e+06 12 KSP preconditioned resid norm 1.369320429479e+06 true resid norm 1.434993628816e+06 ||r(i)||/||b|| 1.111374594910e-04 Linear solve converged due to CONVERGED_RTOL iterations 12 time 511.480000 m=10000 n=10000 Norm of error 4.8462e+06, Iterations 12 0 KSP Residual norm 6.828607739124e+09 0 KSP preconditioned resid norm 6.828607739124e+09 true resid norm 2.081798084592e+10 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.592108138342e+08 1 KSP preconditioned resid norm 1.592108138342e+08 true resid norm 1.085557726631e+09 ||r(i)||/||b|| 5.214519768589e-02 2 KSP Residual norm 4.713015543535e+06 2 KSP preconditioned resid norm 4.713015543535e+06 true resid norm 2.310928708753e+07 ||r(i)||/||b|| 1.110063807752e-03 3 KSP Residual norm 3.998043547851e+05 3 KSP preconditioned resid norm 3.998043547851e+05 true resid norm 2.247029256835e+06 ||r(i)||/||b|| 1.079369451565e-04 4 KSP Residual norm 3.507419330164e+04 4 KSP preconditioned resid norm 3.507419330164e+04 true resid norm 2.008185753840e+05 ||r(i)||/||b|| 9.646400237870e-06 Linear solve converged due to CONVERGED_RTOL iterations 4 Norm of error 5.77295e+11, Iterations 4 **************************************** *********************************************************************************************************************** *** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** **************************************************************************************************************************************************************** ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------- ./ex5 on a named head1.hpc with 1 processor, by lida Wed Jun 1 20:35:41 2022 Using Petsc Release Version 3.17.1, unknown Max Max/Min Avg Total Time (sec): 1.065e+03 1.000 1.065e+03 Objects: 7.090e+02 1.000 7.090e+02 Flops: 3.476e+11 1.000 3.476e+11 3.476e+11 Flops/sec: 3.263e+08 1.000 3.263e+08 3.263e+08 MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 3.4957e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: Original Solve: 8.2717e+02 77.7% 2.5959e+11 74.7% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: Second Solve: 2.3804e+02 22.3% 8.8003e+10 25.3% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: Original Solve MatMult 460 1.0 1.5530e+02 1.0 1.05e+11 1.0 0.0e+00 0.0e+00 0.0e+00 15 30 0 0 0 19 40 0 0 0 676 MatMultAdd 91 1.0 1.7280e+01 1.0 8.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 3 0 0 0 463 MatMultTranspose 91 1.0 2.3679e+01 1.0 8.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 3 3 0 0 0 338 MatSolve 13 1.0 6.4134e-05 1.0 5.85e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9 MatLUFactorSym 1 1.0 2.6981e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.5310e-05 1.0 7.30e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5 MatConvert 7 1.0 1.2939e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 MatScale 21 1.0 3.5806e+00 1.0 2.05e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 572 MatResidual 91 1.0 2.5775e+01 1.0 1.86e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 5 0 0 0 3 7 0 0 0 723 MatAssemblyBegin 36 1.0 1.7278e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 36 1.0 2.1752e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 MatGetRowIJ 1 1.0 5.9204e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 5.1203e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 7 1.0 3.5076e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 MatAXPY 7 1.0 8.9612e+00 1.0 1.17e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 13 MatMatMultSym 7 1.0 1.7313e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatMatMultNum 7 1.0 1.0714e+01 1.0 1.74e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 163 MatPtAPSymbolic 7 1.0 5.7167e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 7 0 0 0 0 0 MatPtAPNumeric 7 1.0 5.7185e+01 1.0 7.77e+09 1.0 0.0e+00 0.0e+00 0.0e+00 5 2 0 0 0 7 3 0 0 0 136 MatTrnMatMultSym 1 1.0 3.3972e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 MatGetSymTrans 7 1.0 3.8600e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 82 1.0 2.0620e+01 1.0 2.84e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 8 0 0 0 2 11 0 0 0 1378 VecNorm 105 1.0 1.4910e+00 1.0 8.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 3 0 0 0 5476 VecScale 90 1.0 8.9402e-01 1.0 2.58e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2888 VecCopy 294 1.0 1.4368e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 VecSet 236 1.0 7.5047e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 21 1.0 9.9549e-01 1.0 3.03e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3047 VecAYPX 559 1.0 1.8945e+01 1.0 1.34e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 4 0 0 0 2 5 0 0 0 709 VecAXPBYCZ 182 1.0 7.3185e+00 1.0 1.52e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 6 0 0 0 2071 VecMAXPY 102 1.0 2.6425e+01 1.0 4.88e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 14 0 0 0 3 19 0 0 0 1845 
VecAssemblyBegin 1 1.0 5.1595e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 1 1.0 1.3132e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 441 1.0 2.1981e+01 1.0 7.34e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 3 3 0 0 0 334 VecNormalize 90 1.0 1.9910e+00 1.0 7.75e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 3 0 0 0 3891 KSPSetUp 16 1.0 7.4611e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 KSPSolve 1 1.0 2.9415e+02 1.0 2.00e+11 1.0 0.0e+00 0.0e+00 0.0e+00 28 58 0 0 0 36 77 0 0 0 680 KSPGMRESOrthog 82 1.0 3.5705e+01 1.0 5.68e+10 1.0 0.0e+00 0.0e+00 0.0e+00 3 16 0 0 0 4 22 0 0 0 1592 PCGAMGGraph_AGG 7 1.0 7.8044e+01 1.0 1.43e+09 1.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 9 1 0 0 0 18 PCGAMGCoarse_AGG 7 1.0 1.2952e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 12 0 0 0 0 16 0 0 0 0 0 PCGAMGProl_AGG 7 1.0 5.1550e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 6 0 0 0 0 0 PCGAMGPOpt_AGG 7 1.0 1.0284e+02 1.0 4.90e+10 1.0 0.0e+00 0.0e+00 0.0e+00 10 14 0 0 0 12 19 0 0 0 476 GAMG: createProl 7 1.0 3.6476e+02 1.0 5.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 34 15 0 0 0 44 19 0 0 0 138 Create Graph 7 1.0 1.2939e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 Filter Graph 7 1.0 6.4028e+01 1.0 1.43e+09 1.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 8 1 0 0 0 22 MIS/Agg 7 1.0 3.5146e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 SA: col data 7 1.0 7.9635e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SA: frmProl0 7 1.0 4.7866e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 6 0 0 0 0 0 SA: smooth 7 1.0 4.2317e+01 1.0 2.47e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 1 0 0 0 5 1 0 0 0 58 GAMG: partLevel 7 1.0 1.1435e+02 1.0 7.77e+09 1.0 0.0e+00 0.0e+00 0.0e+00 11 2 0 0 0 14 3 0 0 0 68 PCGAMG Squ l00 1 1.0 3.3972e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0 PCGAMG Gal l00 1 1.0 6.8174e+01 1.0 4.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 6 1 0 0 0 8 2 0 0 0 65 PCGAMG Opt l00 1 1.0 2.1367e+01 1.0 1.24e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 58 PCGAMG Gal l01 1 1.0 3.6124e+01 1.0 2.30e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 1 0 0 0 4 1 0 0 0 64 PCGAMG Opt l01 1 1.0 5.1169e+00 1.0 3.61e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 70 PCGAMG Gal l02 1 1.0 9.2202e+00 1.0 8.86e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 96 PCGAMG Opt l02 1 1.0 1.3495e+00 1.0 1.25e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 93 PCGAMG Gal l03 1 1.0 7.2320e-01 1.0 1.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 157 PCGAMG Opt l03 1 1.0 1.7474e-01 1.0 1.55e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 89 PCGAMG Gal l04 1 1.0 1.0741e-01 1.0 8.45e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 79 PCGAMG Opt l04 1 1.0 1.9007e-02 1.0 1.17e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 62 PCGAMG Gal l05 1 1.0 3.0491e-03 1.0 4.43e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 145 PCGAMG Opt l05 1 1.0 8.0217e-04 1.0 6.95e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 87 PCGAMG Gal l06 1 1.0 1.2688e-04 1.0 1.27e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 100 PCGAMG Opt l06 1 1.0 9.1779e-05 1.0 2.96e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 32 PCSetUp 1 1.0 4.8375e+02 1.0 5.82e+10 1.0 0.0e+00 0.0e+00 0.0e+00 45 17 0 0 0 58 22 0 0 0 120 PCApply 13 1.0 1.9865e+02 1.0 1.18e+11 1.0 0.0e+00 0.0e+00 0.0e+00 19 34 0 0 0 24 45 0 0 0 593 --- Event Stage 2: Second Solve MatMult 160 1.0 5.5173e+01 1.0 3.60e+10 1.0 0.0e+00 0.0e+00 0.0e+00 
5 10 0 0 0 23 41 0 0 0 652 MatMultAdd 25 1.0 1.4079e+00 1.0 5.61e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMultTranspose 25 1.0 1.1887e-02 1.0 5.61e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 47 MatSolve 5 1.0 3.1421e-05 1.0 1.94e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 62 MatLUFactorSym 1 1.0 4.1040e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 2.0536e-05 1.0 8.30e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 40 MatConvert 5 1.0 9.8007e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 MatScale 15 1.0 2.0833e+00 1.0 1.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 480 MatResidual 25 1.0 7.8182e+00 1.0 5.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 3 6 0 0 0 640 MatAssemblyBegin 26 1.0 4.7507e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 26 1.0 6.1941e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 MatGetRowIJ 1 1.0 3.5968e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 3.4700e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 5 1.0 1.7509e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 7 0 0 0 0 0 MatZeroEntries 1 1.0 4.9738e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 3.0948e+00 1.0 2.54e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMatMultSym 5 1.0 4.4847e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 MatMatMultNum 5 1.0 3.2775e+00 1.0 2.86e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatPtAPSymbolic 5 1.0 1.2886e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatPtAPNumeric 5 1.0 2.4792e+00 1.0 8.02e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatTrnMatMultSym 1 1.0 3.0547e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatGetSymTrans 5 1.0 1.3340e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 54 1.0 9.7819e+00 1.0 1.30e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 4 15 0 0 0 1329 VecNorm 67 1.0 5.1862e-01 1.0 4.60e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 5 0 0 0 8870 VecScale 60 1.0 4.1836e-01 1.0 1.60e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 3825 VecCopy 86 1.0 5.7537e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0 VecSet 76 1.0 3.2515e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecAXPY 11 1.0 6.2773e-01 1.0 1.40e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 2230 VecAYPX 155 1.0 7.3351e+00 1.0 4.50e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 3 5 0 0 0 614 VecAXPBYCZ 50 1.0 2.7304e+00 1.0 5.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 6 0 0 0 1831 VecMAXPY 64 1.0 1.0069e+01 1.0 1.78e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 4 20 0 0 0 1768 VecPointwiseMult 155 1.0 9.0886e+00 1.0 3.10e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 4 4 0 0 0 341 VecNormalize 60 1.0 7.8309e-01 1.0 4.80e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 5 0 0 0 6130 KSPSetUp 12 1.0 3.8298e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 KSPSolve 1 1.0 8.0642e+01 1.0 4.81e+10 1.0 0.0e+00 0.0e+00 0.0e+00 8 14 0 0 0 34 55 0 0 0 596 KSPGMRESOrthog 54 1.0 1.7151e+01 1.0 2.60e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 7 30 0 0 0 1516 PCGAMGGraph_AGG 5 1.0 2.7212e+01 1.0 1.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 11 1 0 0 0 37 PCGAMGCoarse_AGG 5 1.0 5.3094e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 22 0 0 0 0 0 PCGAMGProl_AGG 5 1.0 6.3314e+00 1.0 0.00e+00 
0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 PCGAMGPOpt_AGG 5 1.0 5.8035e+01 1.0 3.76e+10 1.0 0.0e+00 0.0e+00 0.0e+00 5 11 0 0 0 24 43 0 0 0 648 GAMG: createProl 5 1.0 1.4493e+02 1.0 3.86e+10 1.0 0.0e+00 0.0e+00 0.0e+00 14 11 0 0 0 61 44 0 0 0 266 Create Graph 5 1.0 9.8007e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 4 0 0 0 0 0 Filter Graph 5 1.0 1.6596e+01 1.0 1.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 7 1 0 0 0 60 MIS/Agg 5 1.0 1.7585e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 7 0 0 0 0 0 SA: col data 5 1.0 6.9388e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SA: frmProl0 5 1.0 2.9984e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 SA: smooth 5 1.0 1.3478e+01 1.0 4.24e+05 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 6 0 0 0 0 0 GAMG: partLevel 5 1.0 3.7679e+00 1.0 8.02e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 PCGAMG Squ l00 1 1.0 3.0548e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 PCGAMG Gal l00 1 1.0 3.7665e+00 1.0 6.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 PCGAMG Opt l00 1 1.0 7.7613e+00 1.0 2.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 PCGAMG Gal l01 1 1.0 7.6855e-04 1.0 1.19e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 155 PCGAMG Opt l01 1 1.0 6.0377e-04 1.0 4.29e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 71 PCGAMG Gal l02 1 1.0 3.4782e-04 1.0 3.70e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 106 PCGAMG Opt l02 1 1.0 2.2165e-04 1.0 1.33e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 60 PCGAMG Gal l03 1 1.0 1.3189e-04 1.0 1.08e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 82 PCGAMG Opt l03 1 1.0 9.0454e-05 1.0 3.83e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 42 PCGAMG Gal l04 1 1.0 6.2573e-05 1.0 3.34e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 53 PCGAMG Opt l04 1 1.0 4.9897e-05 1.0 1.14e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 23 PCSetUp 1 1.0 1.5265e+02 1.0 3.86e+10 1.0 0.0e+00 0.0e+00 0.0e+00 14 11 0 0 0 64 44 0 0 0 253 PCApply 5 1.0 4.8881e+01 1.0 2.90e+10 1.0 0.0e+00 0.0e+00 0.0e+00 5 8 0 0 0 21 33 0 0 0 593 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 1 1 896 0. --- Event Stage 1: Original Solve Container 7 0 0 0. Matrix 39 23 37853744688 0. Matrix Coarsen 7 7 4704 0. Vector 249 182 42116718600 0. Krylov Solver 16 7 217000 0. Preconditioner 16 7 6496 0. Viewer 1 0 0 0. PetscRandom 7 7 4970 0. Index Set 12 9 8584 0. Distributed Mesh 15 7 35896 0. Star Forest Graph 30 14 16464 0. Discrete System 15 7 7168 0. Weak Form 15 7 4648 0. --- Event Stage 2: Second Solve Container 5 12 7488 0. Matrix 28 44 42665589712 0. Matrix Coarsen 5 5 3360 0. Vector 156 223 48929583824 0. Krylov Solver 11 20 200246 0. Preconditioner 11 20 23064 0. PetscRandom 5 5 3550 0. Index Set 8 11 11264 0. Distributed Mesh 10 18 92304 0. Star Forest Graph 20 36 42336 0. Discrete System 10 18 18432 0. Weak Form 10 18 11952 0. 
======================================================================================================================== Average time to get PetscTime(): 3.03611e-08 #PETSc Option Table entries: -ksp_converged_reason -ksp_monitor -ksp_monitor_true_residual -ksp_type gmres -log_view -m 10000 -pc_gamg_use_parallel_coarse_grid_solver -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices FOPTFLAGS="-O3 -march=native" --download-make ----------------------------------------- Libraries compiled on 2022-05-25 10:03:14 on head1.hpc Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core Using PETSc directory: /home/lida Using PETSc arch: ----------------------------------------- Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 -march=native -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 ----------------------------------------- Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include ----------------------------------------- Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl ----------------------------------------- [lida at head1 tutorials]$ -------------- next part -------------- [lida at head1 tutorials]$ export OMP_NUM_THREADS=1 [lida at head1 tutorials]$ mpirun -n 60 ./ex5 -m 10000 -ksp_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view -pc_type gamg -ksp_type gmres 
-pc_gamg_use_parallel_coarse_grid_solver -------------------------------------------------------------------------- There are not enough slots available in the system to satisfy the 60 slots that were requested by the application: ./ex5 Either request fewer slots for your application, or make more slots available for use. A "slot" is the Open MPI term for an allocatable unit where we can launch a process. The number of slots available are defined by the environment in which Open MPI processes are run: 1. Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided) 2. The --host command line parameter, via a ":N" suffix on the hostname (N defaults to 1 if not provided) 3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.) 4. If none of a hostfile, the --host command line parameter, or an RM is present, Open MPI defaults to the number of processor cores In all the above cases, if you want Open MPI to default to the number of hardware threads instead of the number of processor cores, use the --use-hwthread-cpus option. Alternatively, you can use the --oversubscribe option to ignore the number of available slots when deciding the number of processes to launch. -------------------------------------------------------------------------- [lida at head1 tutorials]$ mpirun --oversubscribe -n 60 ./ex5 -m 10000 -ksp_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view -pc_type gamg -ksp_type gmres -pc_gamg_use_parallel_coarse_grid_solver -------------------------------------------------------------------------- WARNING: No preset parameters were found for the device that Open MPI detected: Local host: head1 Device name: i40iw0 Device vendor ID: 0x8086 Device vendor part ID: 14290 Default device parameters will be used, which may result in lower performance. You can edit any of the files specified by the btl_openib_device_param_files MCA parameter to set values for your device. NOTE: You can turn off this warning by setting the MCA parameter btl_openib_warn_no_device_params_found to 0. -------------------------------------------------------------------------- -------------------------------------------------------------------------- No OpenFabrics connection schemes reported that they were able to be used on a specific port. As such, the openib BTL (OpenFabrics support) will be disabled for this port. 
Local host: head1 Local device: i40iw0 Local port: 1 CPCs attempted: rdmacm, udcm -------------------------------------------------------------------------- [head1.hpc:19648] 119 more processes have sent help message help-mpi-btl-openib.txt / no device params found [head1.hpc:19648] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages [head1.hpc:19648] 119 more processes have sent help message help-mpi-btl-openib-cpc-base.txt / no cpcs for port 0 KSP Residual norm 4.355207026627e+09 0 KSP preconditioned resid norm 4.355207026627e+09 true resid norm 1.823690908212e+09 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.208748211681e+09 1 KSP preconditioned resid norm 1.208748211681e+09 true resid norm 6.155480046246e+08 ||r(i)||/||b|| 3.375286907736e-01 2 KSP Residual norm 4.517088284383e+08 2 KSP preconditioned resid norm 4.517088284383e+08 true resid norm 3.886324242477e+08 ||r(i)||/||b|| 2.131021339732e-01 3 KSP Residual norm 1.603295575511e+08 3 KSP preconditioned resid norm 1.603295575511e+08 true resid norm 1.620131718656e+08 ||r(i)||/||b|| 8.883806523136e-02 4 KSP Residual norm 5.544067857339e+07 4 KSP preconditioned resid norm 5.544067857339e+07 true resid norm 6.120149859376e+07 ||r(i)||/||b|| 3.355914004843e-02 5 KSP Residual norm 1.925294565414e+07 5 KSP preconditioned resid norm 1.925294565414e+07 true resid norm 2.393555310461e+07 ||r(i)||/||b|| 1.312478611196e-02 6 KSP Residual norm 5.919031529729e+06 6 KSP preconditioned resid norm 5.919031529729e+06 true resid norm 6.600310389533e+06 ||r(i)||/||b|| 3.619204526277e-03 7 KSP Residual norm 2.132637762468e+06 7 KSP preconditioned resid norm 2.132637762468e+06 true resid norm 2.301013174864e+06 ||r(i)||/||b|| 1.261734192183e-03 8 KSP Residual norm 7.288135118024e+05 8 KSP preconditioned resid norm 7.288135118024e+05 true resid norm 8.376703989009e+05 ||r(i)||/||b|| 4.593269589318e-04 9 KSP Residual norm 2.618419345570e+05 9 KSP preconditioned resid norm 2.618419345570e+05 true resid norm 2.924464805008e+05 ||r(i)||/||b|| 1.603596745390e-04 10 KSP Residual norm 9.736460918466e+04 10 KSP preconditioned resid norm 9.736460918466e+04 true resid norm 1.093493729815e+05 ||r(i)||/||b|| 5.996047492975e-05 11 KSP Residual norm 3.616464600646e+04 11 KSP preconditioned resid norm 3.616464600646e+04 true resid norm 4.287951581559e+04 ||r(i)||/||b|| 2.351249086262e-05 time 13.250000 m=10000 n=10000 Linear solve converged due to CONVERGED_RTOL iterations 11 time 10.790000 m=10000 n=10000 time 14.340000 m=10000 n=10000 time 11.910000 m=10000 n=10000 time 12.670000 m=10000 n=10000 time 14.900000 m=10000 n=10000 time 13.860000 m=10000 n=10000 time 12.550000 m=10000 n=10000 time 12.610000 m=10000 n=10000 time 11.110000 m=10000 n=10000 time 11.640000 m=10000 n=10000 time 11.980000 m=10000 n=10000 time 14.820000 m=10000 n=10000 time 16.180000 m=10000 n=10000 time 14.330000 m=10000 n=10000 time 13.850000 m=10000 n=10000 time 13.220000 m=10000 n=10000 time 10.220000 m=10000 n=10000 time 12.680000 m=10000 n=10000 time 16.240000 m=10000 n=10000 time 12.490000 m=10000 n=10000 time 16.070000 m=10000 n=10000 time 12.870000 m=10000 n=10000 time 12.170000 m=10000 n=10000 time 15.960000 m=10000 n=10000 time 13.630000 m=10000 n=10000 time 11.530000 m=10000 n=10000 time 13.700000 m=10000 n=10000 time 14.360000 m=10000 n=10000 time 11.690000 m=10000 n=10000 time 13.610000 m=10000 n=10000 time 12.800000 m=10000 n=10000 time 10.350000 m=10000 n=10000 time 14.680000 m=10000 n=10000 time 12.640000 m=10000 n=10000 time 10.860000 
m=10000 n=10000 time 13.650000 m=10000 n=10000 time 14.190000 m=10000 n=10000 time 12.620000 m=10000 n=10000 time 12.860000 m=10000 n=10000 time 13.640000 m=10000 n=10000 time 14.790000 m=10000 n=10000 time 11.720000 m=10000 n=10000 time 13.300000 m=10000 n=10000 time 12.990000 m=10000 n=10000 time 13.100000 m=10000 n=10000 time 14.630000 m=10000 n=10000 time 14.170000 m=10000 n=10000 time 13.830000 m=10000 n=10000 time 12.600000 m=10000 n=10000 time 12.500000 m=10000 n=10000 time 12.050000 m=10000 n=10000 time 13.430000 m=10000 n=10000 time 11.790000 m=10000 n=10000 time 12.900000 m=10000 n=10000 time 11.200000 m=10000 n=10000 time 14.120000 m=10000 n=10000 time 15.230000 m=10000 n=10000 time 14.020000 m=10000 n=10000 time 13.360000 m=10000 n=10000 Norm of error 126115., Iterations 11 0 KSP Residual norm 1.051609452779e+09 0 KSP preconditioned resid norm 1.051609452779e+09 true resid norm 2.150197987965e+09 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.140665240186e+07 1 KSP preconditioned resid norm 1.140665240186e+07 true resid norm 7.877021908575e+07 ||r(i)||/||b|| 3.663393767766e-02 2 KSP Residual norm 9.303562428258e+05 2 KSP preconditioned resid norm 9.303562428258e+05 true resid norm 5.522945877123e+06 ||r(i)||/||b|| 2.568575502366e-03 3 KSP Residual norm 7.562974008642e+04 3 KSP preconditioned resid norm 7.562974008642e+04 true resid norm 4.308267545679e+05 ||r(i)||/||b|| 2.003660858113e-04 4 KSP Residual norm 6.241321855425e+03 4 KSP preconditioned resid norm 6.241321855425e+03 true resid norm 3.569774197924e+04 ||r(i)||/||b|| 1.660207207850e-05 Linear solve converged due to CONVERGED_RTOL iterations 4 Norm of error 9.5902e+09, Iterations 4 **************************************** *********************************************************************************************************************** *** WIDEN YOUR WINDOW TO 160 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** **************************************************************************************************************************************************************** ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------- ./ex5 on a named head1.hpc with 60 processors, by lida Wed Jun 1 20:39:05 2022 Using Petsc Release Version 3.17.1, unknown Max Max/Min Avg Total Time (sec): 8.038e+01 1.001 8.036e+01 Objects: 1.450e+03 1.000 1.450e+03 Flops: 5.486e+09 1.002 5.485e+09 3.291e+11 Flops/sec: 6.827e+07 1.003 6.825e+07 4.095e+09 MPI Msg Count: 3.320e+03 2.712 2.535e+03 1.521e+05 MPI Msg Len (bytes): 5.412e+07 1.926 2.092e+04 3.183e+09 MPI Reductions: 1.547e+03 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 2.1349e-01 0.3% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.000e+00 0.1% 1: Original Solve: 5.8264e+01 72.5% 2.4065e+11 73.1% 1.086e+05 71.4% 2.402e+04 82.0% 9.080e+02 58.7% 2: Second Solve: 2.1880e+01 27.2% 8.8426e+10 26.9% 4.348e+04 28.6% 1.318e+04 18.0% 6.190e+02 40.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: Original Solve BuildTwoSided 99 1.0 5.4267e+00 1.8 0.00e+00 0.0 5.5e+03 8.0e+00 9.9e+01 5 0 4 0 6 7 0 5 0 11 0 BuildTwoSidedF 59 1.0 4.8310e+00 2.2 0.00e+00 0.0 2.8e+03 7.3e+04 5.9e+01 4 0 2 6 4 6 0 3 8 6 0 MatMult 430 1.0 1.0317e+01 1.3 1.64e+09 1.0 5.2e+04 2.8e+04 7.0e+00 11 30 34 46 0 16 41 48 56 1 9505 MatMultAdd 84 1.0 1.1148e+00 1.8 1.23e+08 1.0 7.7e+03 8.8e+03 0.0e+00 1 2 5 2 0 2 3 7 3 0 6638 MatMultTranspose 84 1.0 1.9097e+00 1.7 1.24e+08 1.0 9.0e+03 8.2e+03 7.0e+00 2 2 6 2 0 2 3 8 3 1 3880 MatSolve 12 1.0 8.0690e-05 3.1 3.36e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 250 MatLUFactorSym 1 1.0 6.1153e-05 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 5.7829e-05 5.0 3.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 37 MatConvert 8 1.0 7.6993e-01 1.9 0.00e+00 0.0 1.6e+03 1.3e+04 7.0e+00 1 0 1 1 0 1 0 1 1 1 0 MatScale 21 1.0 3.2515e-01 2.0 3.42e+07 1.0 8.1e+02 2.6e+04 0.0e+00 0 1 1 1 0 0 1 1 1 0 6310 MatResidual 84 1.0 1.8980e+00 1.5 2.87e+08 1.0 9.8e+03 2.6e+04 0.0e+00 2 5 6 8 0 3 7 9 10 0 9073 MatAssemblyBegin 117 1.0 4.0990e+00 2.4 0.00e+00 0.0 2.8e+03 7.3e+04 4.0e+01 3 0 2 6 3 5 0 3 8 4 0 MatAssemblyEnd 117 1.0 6.4010e+00 1.4 1.21e+05 2.3 0.0e+00 0.0e+00 1.4e+02 7 0 0 0 9 9 0 0 0 15 1 MatGetRowIJ 1 1.0 5.1278e-0513.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 1 1.0 2.7949e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMat 4 1.0 1.4416e-01 1.1 0.00e+00 0.0 2.9e+02 1.4e+03 5.6e+01 0 0 0 0 4 0 0 0 0 6 0 MatGetOrdering 1 1.0 1.2932e-02469.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 7 1.0 1.8915e+00 1.2 0.00e+00 0.0 2.0e+04 8.6e+03 9.5e+01 2 0 13 5 6 3 0 19 7 10 0 MatZeroEntries 7 1.0 2.4975e-02 6.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 7 1.0 1.1375e+00 1.2 1.95e+06 1.0 0.0e+00 0.0e+00 7.0e+00 1 0 0 0 0 2 0 0 0 1 103 MatTranspose 14 1.0 3.0706e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 21 1.0 3.4042e+00 1.1 0.00e+00 0.0 2.4e+03 2.6e+04 6.3e+01 4 0 2 2 4 5 0 2 2 7 0 MatMatMultNum 21 1.0 1.7535e+00 1.6 8.62e+07 1.0 8.1e+02 2.6e+04 7.0e+00 2 2 1 1 0 2 2 1 1 1 2946 MatPtAPSymbolic 7 1.0 3.8899e+00 1.0 0.00e+00 0.0 5.3e+03 4.9e+04 4.9e+01 5 0 3 8 3 7 0 5 10 5 0 MatPtAPNumeric 7 1.0 2.3929e+00 1.1 1.34e+08 1.0 1.9e+03 1.0e+05 3.5e+01 3 2 1 6 2 4 3 2 7 4 3361 MatTrnMatMultSym 1 1.0 5.1090e+00 1.0 0.00e+00 0.0 3.5e+02 1.9e+05 1.1e+01 6 0 0 2 1 9 0 0 3 1 0 MatRedundantMat 1 1.0 2.8010e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMPIConcateSeq 1 1.0 3.4418e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetLocalMat 22 1.0 1.1056e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetBrAoCol 21 1.0 2.0194e-01 2.8 0.00e+00 0.0 
5.7e+03 4.7e+04 0.0e+00 0 0 4 8 0 0 0 5 10 0 0 MatGetSymTrans 2 1.0 2.8332e-01 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 81 1.0 3.5909e+00 1.5 4.34e+08 1.0 0.0e+00 0.0e+00 8.1e+01 4 8 0 0 5 5 11 0 0 9 7248 VecNorm 103 1.0 2.5986e+00 1.5 1.29e+08 1.0 0.0e+00 0.0e+00 1.0e+02 3 2 0 0 7 4 3 0 0 11 2988 VecScale 89 1.0 1.6507e-01 1.8 4.14e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 15040 VecCopy 272 1.0 5.8546e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecSet 315 1.0 3.5993e-01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 20 1.0 1.9678e-01 2.2 4.72e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 14398 VecAYPX 516 1.0 8.4424e-01 1.9 2.07e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 5 0 0 0 14681 VecAXPBYCZ 168 1.0 3.6343e-01 2.1 2.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 6 0 0 0 38501 VecMAXPY 100 1.0 1.6303e+00 1.4 7.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 13 0 0 0 2 18 0 0 0 26841 VecAssemblyBegin 21 1.0 7.7428e-01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.9e+01 1 0 0 0 1 1 0 0 0 2 0 VecAssemblyEnd 21 1.0 8.0603e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 413 1.0 8.0314e-01 1.5 1.15e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 3 0 0 0 8566 VecScatterBegin 650 1.0 5.5847e-01 2.3 0.00e+00 0.0 7.4e+04 2.4e+04 2.6e+01 1 0 49 57 2 1 0 68 69 3 0 VecScatterEnd 650 1.0 6.7772e+00 2.7 1.47e+05 2.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 8 0 0 0 0 1 VecNormalize 89 1.0 1.7682e+00 1.4 1.24e+08 1.0 0.0e+00 0.0e+00 8.9e+01 2 2 0 0 6 3 3 0 0 10 4212 SFSetGraph 52 1.0 4.5486e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 40 1.0 1.0026e+00 2.2 0.00e+00 0.0 8.3e+03 1.4e+04 4.0e+01 1 0 5 4 3 1 0 8 4 4 0 SFBcastBegin 102 1.0 1.1024e-0214.9 0.00e+00 0.0 1.9e+04 7.7e+03 0.0e+00 0 0 12 5 0 0 0 17 6 0 0 SFBcastEnd 102 1.0 4.4157e-0110.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 752 1.0 3.4799e-02 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 752 1.0 1.5714e-03 2.5 1.47e+05 2.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5420 KSPSetUp 17 1.0 4.5984e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 0 0 0 0 1 1 0 0 0 2 0 KSPSolve 1 1.0 1.7258e+01 1.0 3.02e+09 1.0 5.9e+04 2.3e+04 5.9e+01 21 55 39 43 4 30 75 55 52 6 10498 KSPGMRESOrthog 81 1.0 4.4162e+00 1.4 8.68e+08 1.0 0.0e+00 0.0e+00 8.1e+01 5 16 0 0 5 6 22 0 0 9 11787 PCGAMGGraph_AGG 7 1.0 6.6074e+00 1.0 2.39e+07 1.0 2.4e+03 1.7e+04 6.3e+01 8 0 2 1 4 11 1 2 2 7 217 PCGAMGCoarse_AGG 7 1.0 9.2609e+00 1.0 0.00e+00 0.0 2.2e+04 1.6e+04 1.2e+02 11 0 14 11 8 16 0 20 14 13 0 PCGAMGProl_AGG 7 1.0 3.7914e+00 1.1 0.00e+00 0.0 3.9e+03 2.2e+04 1.1e+02 5 0 3 3 7 6 0 4 3 12 0 PCGAMGPOpt_AGG 7 1.0 9.7580e+00 1.0 8.12e+08 1.0 1.3e+04 2.4e+04 2.9e+02 12 15 8 10 19 17 20 12 12 32 4991 GAMG: createProl 7 1.0 2.9629e+01 1.0 8.36e+08 1.0 4.1e+04 1.9e+04 5.8e+02 37 15 27 25 37 51 21 38 30 64 1692 Create Graph 7 1.0 7.6994e-01 1.9 0.00e+00 0.0 1.6e+03 1.3e+04 7.0e+00 1 0 1 1 0 1 0 1 1 1 0 Filter Graph 7 1.0 5.9877e+00 1.1 2.39e+07 1.0 8.1e+02 2.6e+04 5.6e+01 7 0 1 1 4 10 1 1 1 6 240 MIS/Agg 7 1.0 1.8917e+00 1.2 0.00e+00 0.0 2.0e+04 8.6e+03 9.5e+01 2 0 13 5 6 3 0 19 7 10 0 SA: col data 7 1.0 8.5757e-01 1.1 0.00e+00 0.0 3.0e+03 2.4e+04 4.8e+01 1 0 2 2 3 1 0 3 3 5 0 SA: frmProl0 7 1.0 2.5043e+00 1.0 0.00e+00 0.0 9.2e+02 1.5e+04 3.5e+01 3 0 1 0 2 4 0 1 1 4 0 SA: smooth 7 1.0 4.9217e+00 1.0 3.62e+07 1.0 3.3e+03 2.6e+04 9.4e+01 6 1 2 3 6 8 1 3 3 10 441 GAMG: partLevel 7 1.0 6.8353e+00 1.0 1.34e+08 1.0 
7.9e+03 5.7e+04 1.9e+02 8 2 5 14 12 12 3 7 17 21 1177 repartition 2 1.0 4.0880e-01 1.0 0.00e+00 0.0 7.0e+02 6.2e+02 1.1e+02 1 0 0 0 7 1 0 1 0 12 0 Invert-Sort 2 1.0 6.3940e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 Move A 2 1.0 1.2997e-01 1.1 0.00e+00 0.0 2.9e+02 1.4e+03 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 Move P 2 1.0 2.4461e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+01 0 0 0 0 2 0 0 0 0 4 0 PCGAMG Squ l00 1 1.0 5.1090e+00 1.0 0.00e+00 0.0 3.5e+02 1.9e+05 1.1e+01 6 0 0 2 1 9 0 0 3 1 0 PCGAMG Gal l00 1 1.0 3.9555e+00 1.1 7.69e+07 1.0 9.4e+02 1.5e+05 1.3e+01 5 1 1 4 1 7 2 1 5 1 1167 PCGAMG Opt l00 1 1.0 2.4544e+00 1.1 1.67e+07 1.0 4.7e+02 8.0e+04 1.1e+01 3 0 0 1 1 4 0 0 1 1 407 PCGAMG Gal l01 1 1.0 1.6669e+00 1.0 3.97e+07 1.0 9.4e+02 1.8e+05 1.3e+01 2 1 1 5 1 3 1 1 6 1 1428 PCGAMG Opt l01 1 1.0 5.6540e-01 1.0 5.09e+06 1.0 4.7e+02 5.0e+04 1.1e+01 1 0 0 1 1 1 0 0 1 1 540 PCGAMG Gal l02 1 1.0 4.9954e-01 1.0 1.54e+07 1.1 9.4e+02 1.1e+05 1.3e+01 1 0 1 3 1 1 0 1 4 1 1829 PCGAMG Opt l02 1 1.0 2.3346e-01 1.1 1.91e+06 1.0 4.7e+02 3.2e+04 1.1e+01 0 0 0 0 1 0 0 0 1 1 488 PCGAMG Gal l03 1 1.0 2.1878e-01 1.0 2.19e+06 1.4 9.4e+02 3.6e+04 1.2e+01 0 0 1 1 1 0 0 1 1 1 567 PCGAMG Opt l03 1 1.0 1.3572e-01 1.1 2.49e+05 1.1 4.7e+02 1.1e+04 1.0e+01 0 0 0 0 1 0 0 0 0 1 108 PCGAMG Gal l04 1 1.0 1.1188e-01 1.2 1.95e+05 2.2 2.4e+03 3.6e+03 1.2e+01 0 0 2 0 1 0 0 2 0 1 93 PCGAMG Opt l04 1 1.0 1.3686e-01 1.1 2.16e+04 1.7 9.4e+02 1.6e+03 1.0e+01 0 0 1 0 1 0 0 1 0 1 9 PCGAMG Gal l05 1 1.0 3.8335e-03 1.4 3.21e+04 0.0 9.5e+02 5.7e+02 1.2e+01 0 0 1 0 1 0 0 1 0 1 133 PCGAMG Opt l05 1 1.0 1.3979e-02 1.1 4.02e+03 0.0 4.3e+02 2.7e+02 1.0e+01 0 0 0 0 1 0 0 0 0 1 5 PCGAMG Gal l06 1 1.0 2.1886e-03 1.5 8.55e+03 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 4 PCGAMG Opt l06 1 1.0 1.5361e-02 1.0 2.63e+03 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 1 0 0 0 0 1 0 PCSetUp 1 1.0 3.7111e+01 1.0 9.70e+08 1.0 4.9e+04 2.5e+04 8.3e+02 46 18 32 39 54 64 24 45 47 92 1568 PCApply 12 1.0 1.1146e+01 1.1 1.82e+09 1.0 5.7e+04 2.0e+04 2.3e+01 13 33 37 36 1 18 45 52 44 3 9768 --- Event Stage 2: Second Solve BuildTwoSided 73 1.0 1.9759e+00 1.9 0.00e+00 0.0 6.8e+03 8.0e+00 7.3e+01 2 0 4 0 5 7 0 16 0 12 0 BuildTwoSidedF 44 1.0 1.2189e+00 2.0 0.00e+00 0.0 8.6e+03 1.7e+04 4.4e+01 1 0 6 5 3 4 0 20 25 7 0 MatMult 160 1.0 4.2652e+00 1.4 6.02e+08 1.0 1.6e+04 2.3e+04 4.0e+00 5 11 11 12 0 17 41 38 67 1 8474 MatMultAdd 25 1.0 1.8994e-01 3.6 6.17e+05 1.0 2.1e+03 1.7e+02 0.0e+00 0 0 1 0 0 0 0 5 0 0 193 MatMultTranspose 25 1.0 7.2792e-0119.4 6.18e+05 1.0 2.9e+03 1.5e+02 5.0e+00 0 0 2 0 0 1 0 7 0 1 51 MatSolve 5 1.0 3.6131e-05 1.8 1.99e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3305 MatLUFactorSym 1 1.0 9.1382e-05 5.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 3.8696e-05 5.9 9.09e+02 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1409 MatConvert 6 1.0 4.6379e-01 2.2 0.00e+00 0.0 9.6e+02 1.0e+04 5.0e+00 0 0 1 0 0 1 0 2 2 1 0 MatScale 15 1.0 2.0882e-01 2.2 1.69e+07 1.0 4.8e+02 2.0e+04 0.0e+00 0 0 0 0 0 1 1 1 2 0 4848 MatResidual 25 1.0 6.9403e-01 2.1 8.38e+07 1.0 2.4e+03 2.0e+04 0.0e+00 1 2 2 2 0 2 6 6 8 0 7240 MatAssemblyBegin 87 1.0 8.9302e-01 2.2 0.00e+00 0.0 8.6e+03 1.7e+04 3.0e+01 1 0 6 5 2 3 0 20 25 5 0 MatAssemblyEnd 87 1.0 2.6614e+00 1.3 9.98e+04 1.0 0.0e+00 0.0e+00 1.0e+02 3 0 0 0 7 11 0 0 0 16 2 MatGetRowIJ 1 1.0 1.2436e-021825.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 1 1.0 5.4925e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 1 0 MatCreateSubMat 4 
1.0 3.1275e-01 1.1 0.00e+00 0.0 3.0e+02 2.0e+02 5.6e+01 0 0 0 0 4 1 0 1 0 9 0 MatGetOrdering 1 1.0 1.2514e-02516.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 5 1.0 1.0904e+00 1.8 0.00e+00 0.0 2.9e+03 6.3e+02 1.5e+01 1 0 2 0 1 4 0 7 0 2 0 MatZeroEntries 6 1.0 5.8985e-02 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 5 1.0 4.8172e-01 1.2 2.35e+04 1.0 0.0e+00 0.0e+00 5.0e+00 1 0 0 0 0 2 0 0 0 1 3 MatTranspose 10 1.0 3.2074e-02 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 15 1.0 1.1367e+00 1.0 0.00e+00 0.0 1.4e+03 7.0e+03 4.5e+01 1 0 1 0 3 5 0 3 2 7 0 MatMatMultNum 15 1.0 2.4953e-01 1.2 1.02e+06 1.0 4.8e+02 5.1e+02 5.0e+00 0 0 0 0 0 1 0 1 0 1 241 MatPtAPSymbolic 5 1.0 8.1101e-01 1.1 0.00e+00 0.0 2.9e+03 4.2e+03 3.5e+01 1 0 2 0 2 4 0 7 2 6 0 MatPtAPNumeric 5 1.0 3.9561e-01 1.2 1.58e+06 1.0 9.6e+02 2.2e+03 2.5e+01 0 0 1 0 2 2 0 2 0 4 235 MatTrnMatMultSym 1 1.0 9.9313e-01 1.0 0.00e+00 0.0 3.5e+02 2.2e+03 1.1e+01 1 0 0 0 1 4 0 1 0 2 0 MatRedundantMat 1 1.0 5.4990e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 1 0 MatMPIConcateSeq 1 1.0 5.8523e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetLocalMat 16 1.0 5.0405e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 MatGetBrAoCol 15 1.0 1.0643e-01 2.6 0.00e+00 0.0 3.4e+03 6.5e+03 0.0e+00 0 0 2 1 0 0 0 8 4 0 0 MatGetSymTrans 2 1.0 7.1719e-02 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 54 1.0 1.5885e+00 1.7 2.17e+08 1.0 0.0e+00 0.0e+00 5.4e+01 2 4 0 0 3 6 15 0 0 9 8198 VecNorm 67 1.0 1.6017e+00 1.4 7.67e+07 1.0 0.0e+00 0.0e+00 6.7e+01 2 1 0 0 4 6 5 0 0 11 2875 VecScale 60 1.0 1.0996e-01 1.7 2.67e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 14571 VecCopy 86 1.0 2.5385e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSet 106 1.0 1.8434e-01 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 11 1.0 9.2483e-02 2.2 2.33e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 15142 VecAYPX 155 1.0 3.6418e-01 2.3 7.51e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 5 0 0 0 12379 VecAXPBYCZ 50 1.0 1.6760e-01 2.9 8.35e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 6 0 0 0 29894 VecMAXPY 64 1.0 6.9542e-01 1.6 2.97e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 3 20 0 0 0 25634 VecAssemblyBegin 16 1.0 5.5466e-01 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 2 0 0 0 2 0 VecAssemblyEnd 16 1.0 7.5968e-05 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 155 1.0 3.8336e-01 1.8 5.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 4 0 0 0 8103 VecScatterBegin 242 1.0 3.2776e-01 2.7 0.00e+00 0.0 2.5e+04 1.6e+04 1.9e+01 0 0 17 12 1 1 0 58 69 3 0 VecScatterEnd 242 1.0 2.3472e+00 2.9 1.22e+03 3.5 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 7 0 0 0 0 0 VecNormalize 60 1.0 1.3091e+00 1.6 8.01e+07 1.0 0.0e+00 0.0e+00 6.0e+01 1 1 0 0 4 5 5 0 0 10 3672 SFSetGraph 39 1.0 8.6654e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 29 1.0 8.1719e-01 2.8 0.00e+00 0.0 4.9e+03 2.2e+03 2.9e+01 1 0 3 0 2 3 0 11 2 5 0 SFBcastBegin 20 1.0 2.9462e-04 2.5 0.00e+00 0.0 1.9e+03 7.5e+02 0.0e+00 0 0 1 0 0 0 0 4 0 0 0 SFBcastEnd 20 1.0 2.5962e-01143.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 262 1.0 1.7704e-02100.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 262 1.0 1.3156e-02131.3 1.22e+03 3.5 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3 KSPSetUp 13 1.0 1.7844e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 
0 0 1 1 0 0 0 2 0 KSPSolve 1 1.0 5.4889e+00 1.0 8.05e+08 1.0 1.6e+04 1.7e+04 3.2e+01 7 15 11 9 2 25 55 37 48 5 8798 KSPGMRESOrthog 54 1.0 1.9637e+00 1.4 4.34e+08 1.0 0.0e+00 0.0e+00 5.4e+01 2 8 0 0 3 7 29 0 0 9 13263 PCGAMGGraph_AGG 5 1.0 2.2872e+00 1.0 1.68e+07 1.0 1.4e+03 1.3e+04 4.5e+01 3 0 1 1 3 10 1 3 3 7 439 PCGAMGCoarse_AGG 5 1.0 3.2731e+00 1.0 0.00e+00 0.0 4.3e+03 9.2e+02 3.7e+01 4 0 3 0 2 15 0 10 1 6 0 PCGAMGProl_AGG 5 1.0 1.2727e+00 1.0 0.00e+00 0.0 2.3e+03 4.5e+02 7.9e+01 2 0 1 0 5 6 0 5 0 13 0 PCGAMGPOpt_AGG 5 1.0 5.7655e+00 1.0 6.29e+08 1.0 7.4e+03 1.4e+04 2.0e+02 7 11 5 3 13 26 43 17 19 33 6544 GAMG: createProl 5 1.0 1.2524e+01 1.0 6.46e+08 1.0 1.5e+04 8.4e+03 3.7e+02 16 12 10 4 24 57 44 36 23 59 3093 Create Graph 5 1.0 4.6379e-01 2.2 0.00e+00 0.0 9.6e+02 1.0e+04 5.0e+00 0 0 1 0 0 1 0 2 2 1 0 Filter Graph 5 1.0 2.0622e+00 1.1 1.68e+07 1.0 4.8e+02 2.0e+04 4.0e+01 2 0 0 0 3 9 1 1 2 6 487 MIS/Agg 5 1.0 1.0905e+00 1.8 0.00e+00 0.0 2.9e+03 6.3e+02 1.5e+01 1 0 2 0 1 4 0 7 0 2 0 SA: col data 5 1.0 5.3167e-01 1.2 0.00e+00 0.0 1.7e+03 5.1e+02 3.4e+01 1 0 1 0 2 2 0 4 0 5 0 SA: frmProl0 5 1.0 5.6108e-01 1.1 0.00e+00 0.0 6.0e+02 2.8e+02 2.5e+01 1 0 0 0 2 2 0 1 0 4 0 SA: smooth 5 1.0 2.0259e+00 1.1 4.34e+05 1.0 1.9e+03 5.4e+03 6.6e+01 2 0 1 0 4 9 0 4 2 11 13 GAMG: partLevel 5 1.0 1.8885e+00 1.0 1.58e+06 1.0 4.5e+03 3.2e+03 1.7e+02 2 0 3 0 11 9 0 10 3 27 49 repartition 2 1.0 7.2098e-01 1.0 0.00e+00 0.0 6.7e+02 1.0e+02 1.1e+02 1 0 0 0 7 3 0 2 0 17 0 Invert-Sort 2 1.0 7.2351e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 2 0 Move A 2 1.0 1.6566e-01 1.2 0.00e+00 0.0 3.0e+02 2.0e+02 3.0e+01 0 0 0 0 2 1 0 1 0 5 0 Move P 2 1.0 1.9534e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+01 0 0 0 0 2 1 0 0 0 5 0 PCGAMG Squ l00 1 1.0 9.9313e-01 1.0 0.00e+00 0.0 3.5e+02 2.2e+03 1.1e+01 1 0 0 0 1 4 0 1 0 2 0 PCGAMG Gal l00 1 1.0 7.0349e-01 1.0 9.22e+05 1.0 9.4e+02 1.2e+04 1.3e+01 1 0 1 0 1 3 0 2 2 2 78 PCGAMG Opt l00 1 1.0 1.1336e+00 1.1 2.00e+05 1.0 4.7e+02 2.1e+04 1.1e+01 1 0 0 0 1 5 0 1 2 2 11 PCGAMG Gal l01 1 1.0 1.3150e-01 1.1 4.76e+05 1.1 9.4e+02 2.1e+03 1.2e+01 0 0 1 0 1 1 0 2 0 2 210 PCGAMG Opt l01 1 1.0 1.3884e-01 1.1 6.11e+04 1.0 4.7e+02 6.0e+02 1.0e+01 0 0 0 0 1 1 0 1 0 2 26 PCGAMG Gal l02 1 1.0 1.7012e-01 1.1 1.76e+05 1.3 9.4e+02 1.2e+03 1.2e+01 0 0 1 0 1 1 0 2 0 2 57 PCGAMG Opt l02 1 1.0 3.7929e-02 1.0 2.31e+04 1.1 4.7e+02 3.9e+02 1.0e+01 0 0 0 0 1 0 0 1 0 2 35 PCGAMG Gal l03 1 1.0 1.2656e-01 1.1 1.65e+04 1.6 9.4e+02 2.8e+02 1.2e+01 0 0 1 0 1 1 0 2 0 2 7 PCGAMG Opt l03 1 1.0 1.6197e-03 1.3 2.75e+03 1.4 4.7e+02 1.4e+02 1.0e+01 0 0 0 0 1 0 0 1 0 2 90 PCGAMG Gal l04 1 1.0 7.2816e-02 1.4 4.91e+03 0.0 6.4e+01 6.3e+01 1.2e+01 0 0 0 0 1 0 0 0 0 2 0 PCGAMG Opt l04 1 1.0 1.1088e-01 1.2 1.50e+03 0.0 3.2e+01 5.5e+01 1.0e+01 0 0 0 0 1 0 0 0 0 2 0 PCSetUp 1 1.0 1.5702e+01 1.0 6.47e+08 1.0 2.0e+04 7.2e+03 5.8e+02 19 12 13 5 38 72 44 46 25 94 2473 PCApply 5 1.0 3.6198e+00 1.2 4.87e+08 1.0 1.5e+04 1.3e+04 1.7e+01 4 9 10 6 1 15 33 35 34 3 8064 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 1 1 896 0. --- Event Stage 1: Original Solve Container 14 4 2496 0. Matrix 195 108 1577691960 0. Matrix Coarsen 7 7 4704 0. Vector 352 266 734665864 0. Index Set 110 97 704896 0. Star Forest Graph 82 49 63224 0. Krylov Solver 17 7 217000 0. Preconditioner 17 7 6496 0. 
Viewer 1 0 0 0. PetscRandom 7 7 4970 0. Distributed Mesh 15 7 35896 0. Discrete System 15 7 7168 0. Weak Form 15 7 4648 0. --- Event Stage 2: Second Solve Container 10 20 12480 0. Matrix 144 231 1909487728 0. Matrix Coarsen 5 5 3360 0. Vector 237 323 870608184 0. Index Set 88 101 154152 0. Star Forest Graph 59 92 117152 0. Krylov Solver 12 22 203734 0. Preconditioner 12 22 25080 0. PetscRandom 5 5 3550 0. Distributed Mesh 10 18 92304 0. Discrete System 10 18 18432 0. Weak Form 10 18 11952 0. ======================================================================================================================== Average time to get PetscTime(): 3.64147e-08 Average time for MPI_Barrier(): 0.00739469 Average time for zero size MPI_Send(): 0.000168472 #PETSc Option Table entries: -ksp_converged_reason -ksp_monitor -ksp_monitor_true_residual -ksp_type gmres -log_view -m 10000 -pc_gamg_use_parallel_coarse_grid_solver -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices FOPTFLAGS="-O3 -march=native" --download-make ----------------------------------------- Libraries compiled on 2022-05-25 10:03:14 on head1.hpc Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core Using PETSc directory: /home/lida Using PETSc arch: ----------------------------------------- Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 -march=native -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 ----------------------------------------- Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include ----------------------------------------- Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 
-Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl
-----------------------------------------
[lida at head1 tutorials]$

From bsmith at petsc.dev Wed Jun 1 13:06:02 2022
From: bsmith at petsc.dev (Barry Smith)
Date: Wed, 1 Jun 2022 14:06:02 -0400
Subject: [petsc-users] Mat created by DMStag cannot access ghost points
In-Reply-To:
References:
Message-ID: <859FA50E-F3C2-4E54-AEDC-7C8A70D3FCE1@petsc.dev>

   This appears to be a bug in the DMStag/Mat preallocator code. If you add, after the DMCreateMatrix() line in your code,

   PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE));

your code will run correctly.

   Patrick and Matt,

   MatPreallocatorPreallocate_Preallocator() has

   PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, p->nooffproc));

to make the assembly of the stag matrix from the preallocator matrix a little faster, but then it never "undoes" this call. Hence the matrix is left in a state where it will error if someone sets values from a different rank (which they certainly can, using DMStagMatSetValuesStencil()).

   I think you need to clear NO_OFF_PROC at the end of MatPreallocatorPreallocate_Preallocator(): the fact that the preallocation process never needed communication does not mean that when someone puts real values in the matrix they will never use communication; they can put in values any dang way they please.

   I don't know why this bug has not come up before.

   Barry

> On May 31, 2022, at 11:08 PM, Ye Changqing wrote:
>
> Dear all,
>
> [BugReport.c] is a sample code, [BugReportParallel.output] is the output when executing BugReport with mpiexec, [BugReportSerial.output] is the output of the serial execution.
>
> Best,
> Changqing
>
> From: Dave May
> Sent: May 31, 2022 22:55
> To: Ye Changqing
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] Mat created by DMStag cannot access ghost points
>
> On Tue 31. May 2022 at 16:28, Ye Changqing wrote:
> Dear developers of PETSc,
>
> I encountered a problem when using the DMStag module. The program runs perfectly in serial, while errors are thrown in parallel (using mpiexec). Some rows in the Mat cannot be accessed in local processes when looping over all elements in the DMStag. The DM object I used only has one DOF in each element. Hence, I could switch to the DMDA module easily, and the program now is back to normal.
>
> Some snippets are below.
>
> Initialise a DMStag object:
> PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1, DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P)));
> Created a Mat:
> PetscCall(DMCreateMatrix(s_ctx->dm_P, A));
> Loop:
> PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx, &ny, &nz, &extrax, &extray, &extraz));
> for (ey = starty; ey < starty + ny; ++ey)
>   for (ex = startx; ex < startx + nx; ++ex)
>   {
>     ...
>     PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2, &col[0], &val_A[0][0], ADD_VALUES)); // The traceback shows the problem is in here.
>   }
>
> In addition to the code or MWE, please forward us the complete stack trace / error thrown to stdout.
>
> Thanks,
> Dave
>
> Best,
> Changqing

-------------- next part --------------
An HTML attachment was scrubbed...
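For reference, a minimal sketch of how the suggested workaround fits around the snippets above (the variable names are taken from the snippet and are illustrative only; this is a sketch, not a tested patch):

   Mat A;
   PetscCall(DMCreateMatrix(s_ctx->dm_P, &A));
   /* Workaround discussed above: re-enable off-process insertions
      that the preallocator left disabled. */
   PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE));

   /* ... fill the matrix as in the loop above, e.g. ... */
   PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, A, 2, &row[0], 2, &col[0], &val_A[0][0], ADD_VALUES));

   /* Off-process entries are communicated during assembly. */
   PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
   PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));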
URL: From bsmith at petsc.dev Wed Jun 1 13:08:51 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 1 Jun 2022 14:08:51 -0400 Subject: [petsc-users] Sparse linear system solving In-Reply-To: <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru> <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> Message-ID: <8EE5882E-238C-4FFF-8E51-7AA318B225E8@petsc.dev> PETSc is an MPI library. It is not an OpenMP library. Only some external packages that PETSc uses can use OpenMP, things like GAMG will not utilize OpenMP pretty much at all. Barry > On Jun 1, 2022, at 1:37 PM, Lidia wrote: > > Dear Matt, > > Thank you for the rule of 10,000 variables per process! We have run ex.5 with matrix 1e4 x 1e4 at our cluster and got a good performance dynamics (see the figure "performance.png" - dependency of the solving time in seconds on the number of cores). We have used GAMG preconditioner (multithread: we have added the option "-pc_gamg_use_parallel_coarse_grid_solver") and GMRES solver. And we have set one openMP thread to every MPI process. Now the ex.5 is working good on many mpi processes! But the running uses about 100 GB of RAM. > > How we can run ex.5 using many openMP threads without mpi? If we just change the running command, the cores are not loaded normally: usually just one core is loaded in 100 % and others are idle. Sometimes all cores are working in 100 % during 1 second but then again become idle about 30 seconds. Can the preconditioner use many threads and how to activate this option? > > The solving times (the time of the solver work) using 60 openMP threads is 511 seconds now, and while using 60 MPI processes - 13.19 seconds. > > ksp_monitor outs for both cases (many openMP threads or many MPI processes) are attached. > > > > Thank you! > > Best, > Lidia > > On 31.05.2022 15:21, Matthew Knepley wrote: >> I have looked at the local logs. First, you have run problems of size 12 and 24. As a rule of thumb, you need 10,000 >> variables per process in order to see good speedup. >> >> Thanks, >> >> Matt >> >> On Tue, May 31, 2022 at 8:19 AM Matthew Knepley > wrote: >> On Tue, May 31, 2022 at 7:39 AM Lidia > wrote: >> Matt, Mark, thank you much for your answers! >> >> >> >> Now we have run example # 5 on our computer cluster and on the local server and also have not seen any performance increase, but by unclear reason running times on the local server are much better than on the cluster. >> >> I suspect that you are trying to get speedup without increasing the memory bandwidth: >> >> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup >> >> Thanks, >> >> Matt >> Now we will try to run petsc #5 example inside a docker container on our server and see if the problem is in our environment. I'll write you the results of this test as soon as we get it. >> >> The ksp_monitor outs for the 5th test at the current local server configuration (for 2 and 4 mpi processes) and for the cluster (for 1 and 3 mpi processes) are attached . >> >> >> >> And one more question. Potentially we can use 10 nodes and 96 threads at each node on our cluster. What do you think, which combination of numbers of mpi processes and openmp threads may be the best for the 5th example? >> >> Thank you! 
>> >> >> >> Best, >> Lidiia >> >> On 31.05.2022 05:42, Mark Adams wrote: >>> And if you see "NO" change in performance I suspect the solver/matrix is all on one processor. >>> (PETSc does not use threads by default so threads should not change anything). >>> >>> As Matt said, it is best to start with a PETSc example that does something like what you want (parallel linear solve, see src/ksp/ksp/tutorials for examples), and then add your code to it. >>> That way you get the basic infrastructure in place for you, which is pretty obscure to the uninitiated. >>> >>> Mark >>> >>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley > wrote: >>> On Mon, May 30, 2022 at 10:12 PM Lidia > wrote: >>> Dear colleagues, >>> >>> Is here anyone who have solved big sparse linear matrices using PETSC? >>> >>> There are lots of publications with this kind of data. Here is one recent one: https://arxiv.org/abs/2204.01722 >>> >>> We have found NO performance improvement while using more and more mpi >>> processes (1-2-3) and open-mp threads (from 1 to 72 threads). Did anyone >>> faced to this problem? Does anyone know any possible reasons of such >>> behaviour? >>> >>> Solver behavior is dependent on the input matrix. The only general-purpose solvers >>> are direct, but they do not scale linearly and have high memory requirements. >>> >>> Thus, in order to make progress you will have to be specific about your matrices. >>> >>> We use AMG preconditioner and GMRES solver from KSP package, as our >>> matrix is large (from 100 000 to 1e+6 rows and columns), sparse, >>> non-symmetric and includes both positive and negative values. But >>> performance problems also exist while using CG solvers with symmetric >>> matrices. >>> >>> There are many PETSc examples, such as example 5 for the Laplacian, that exhibit >>> good scaling with both AMG and GMG. >>> >>> Could anyone help us to set appropriate options of the preconditioner >>> and solver? Now we use default parameters, maybe they are not the best, >>> but we do not know a good combination. Or maybe you could suggest any >>> other pairs of preconditioner+solver for such tasks? >>> >>> I can provide more information: the matrices that we solve, c++ script >>> to run solving using petsc and any statistics obtained by our runs. >>> >>> First, please provide a description of the linear system, and the output of >>> >>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view >>> >>> for each test case. >>> >>> Thanks, >>> >>> Matt >>> >>> Thank you in advance! >>> >>> Best regards, >>> Lidiia Varshavchik, >>> Ioffe Institute, St. Petersburg, Russia >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Jun 1 13:14:02 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 1 Jun 2022 14:14:02 -0400 Subject: [petsc-users] Sparse linear system solving In-Reply-To: <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru> <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> Message-ID: On Wed, Jun 1, 2022 at 1:43 PM Lidia wrote: > Dear Matt, > > Thank you for the rule of 10,000 variables per process! We have run ex.5 > with matrix 1e4 x 1e4 at our cluster and got a good performance dynamics > (see the figure "performance.png" - dependency of the solving time in > seconds on the number of cores). We have used GAMG preconditioner > (multithread: we have added the option " > -pc_gamg_use_parallel_coarse_grid_solver") and GMRES solver. And we have > set one openMP thread to every MPI process. Now the ex.5 is working good on > many mpi processes! But the running uses about 100 GB of RAM. > > How we can run ex.5 using many openMP threads without mpi? If we just > change the running command, the cores are not loaded normally: usually just > one core is loaded in 100 % and others are idle. Sometimes all cores are > working in 100 % during 1 second but then again become idle about 30 > seconds. Can the preconditioner use many threads and how to activate this > option? > Maye you could describe what you are trying to accomplish? Threads and processes are not really different, except for memory sharing. However, sharing large complex data structures rarely works. That is why they get partitioned and operate effectively as distributed memory. You would not really save memory by using threads in this instance, if that is your goal. This is detailed in the talks in this session (see 2016 PP Minisymposium on this page https://cse.buffalo.edu/~knepley/relacs.html). Thanks, Matt > The solving times (the time of the solver work) using 60 openMP threads is > 511 seconds now, and while using 60 MPI processes - 13.19 seconds. > > ksp_monitor outs for both cases (many openMP threads or many MPI > processes) are attached. > > > Thank you! > Best, > Lidia > > On 31.05.2022 15:21, Matthew Knepley wrote: > > I have looked at the local logs. First, you have run problems of size 12 > and 24. As a rule of thumb, you need 10,000 > variables per process in order to see good speedup. > > Thanks, > > Matt > > On Tue, May 31, 2022 at 8:19 AM Matthew Knepley wrote: > >> On Tue, May 31, 2022 at 7:39 AM Lidia wrote: >> >>> Matt, Mark, thank you much for your answers! >>> >>> >>> Now we have run example # 5 on our computer cluster and on the local >>> server and also have not seen any performance increase, but by unclear >>> reason running times on the local server are much better than on the >>> cluster. >>> >> I suspect that you are trying to get speedup without increasing the >> memory bandwidth: >> >> >> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup >> >> Thanks, >> >> Matt >> >>> Now we will try to run petsc #5 example inside a docker container on our >>> server and see if the problem is in our environment. I'll write you the >>> results of this test as soon as we get it. >>> >>> The ksp_monitor outs for the 5th test at the current local server >>> configuration (for 2 and 4 mpi processes) and for the cluster (for 1 and 3 >>> mpi processes) are attached . >>> >>> >>> And one more question. 
Potentially we can use 10 nodes and 96 threads at >>> each node on our cluster. What do you think, which combination of numbers >>> of mpi processes and openmp threads may be the best for the 5th example? >>> >>> Thank you! >>> >>> >>> Best, >>> Lidiia >>> >>> On 31.05.2022 05:42, Mark Adams wrote: >>> >>> And if you see "NO" change in performance I suspect the solver/matrix is >>> all on one processor. >>> (PETSc does not use threads by default so threads should not change >>> anything). >>> >>> As Matt said, it is best to start with a PETSc example that does >>> something like what you want (parallel linear solve, see >>> src/ksp/ksp/tutorials for examples), and then add your code to it. >>> That way you get the basic infrastructure in place for you, which is >>> pretty obscure to the uninitiated. >>> >>> Mark >>> >>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley >>> wrote: >>> >>>> On Mon, May 30, 2022 at 10:12 PM Lidia >>>> wrote: >>>> >>>>> Dear colleagues, >>>>> >>>>> Is here anyone who have solved big sparse linear matrices using PETSC? >>>>> >>>> >>>> There are lots of publications with this kind of data. Here is one >>>> recent one: https://arxiv.org/abs/2204.01722 >>>> >>>> >>>>> We have found NO performance improvement while using more and more mpi >>>>> processes (1-2-3) and open-mp threads (from 1 to 72 threads). Did >>>>> anyone >>>>> faced to this problem? Does anyone know any possible reasons of such >>>>> behaviour? >>>>> >>>> >>>> Solver behavior is dependent on the input matrix. The only >>>> general-purpose solvers >>>> are direct, but they do not scale linearly and have high memory >>>> requirements. >>>> >>>> Thus, in order to make progress you will have to be specific about your >>>> matrices. >>>> >>>> >>>>> We use AMG preconditioner and GMRES solver from KSP package, as our >>>>> matrix is large (from 100 000 to 1e+6 rows and columns), sparse, >>>>> non-symmetric and includes both positive and negative values. But >>>>> performance problems also exist while using CG solvers with symmetric >>>>> matrices. >>>>> >>>> >>>> There are many PETSc examples, such as example 5 for the Laplacian, >>>> that exhibit >>>> good scaling with both AMG and GMG. >>>> >>>> >>>>> Could anyone help us to set appropriate options of the preconditioner >>>>> and solver? Now we use default parameters, maybe they are not the >>>>> best, >>>>> but we do not know a good combination. Or maybe you could suggest any >>>>> other pairs of preconditioner+solver for such tasks? >>>>> >>>>> I can provide more information: the matrices that we solve, c++ script >>>>> to run solving using petsc and any statistics obtained by our runs. >>>>> >>>> >>>> First, please provide a description of the linear system, and the >>>> output of >>>> >>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view >>>> >>>> for each test case. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thank you in advance! >>>>> >>>>> Best regards, >>>>> Lidiia Varshavchik, >>>>> Ioffe Institute, St. Petersburg, Russia >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From badi.hamid at gmail.com Thu Jun 2 04:38:32 2022 From: badi.hamid at gmail.com (hamid badi) Date: Thu, 2 Jun 2022 11:38:32 +0200 Subject: [petsc-users] Petsc with mingw64 Message-ID: Hi, I want to compile petsc with openblas & mumps (sequential) under mingw64. To do so, I compiled openblas and mumps without any problem. But when it comes to petsc, configure can't find my mumps. I use the following configuration options : --with-shared-libraries=1 --with-openmp=1 --with-mpi=0 --with-debugging=0 --with-scalar-type=real --with-x=0 --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-windows-graphics=0 --with-openblas=1 --with-openblas-dir=/mingw64/ --with-mumps=1 --with-mumps-include=~/mumps-git/build/include --with-mumps-lib="-L~/mumps-git/build/lib -lsmumps -ldmumps -lmumps_common -lpord -lmpiseq" --with-mumps-serial=1 i get the following error : UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- --with-mumps-lib=['-L~/mumps-git/build/lib', '-lsmumps', '-ldmumps', '-lmumps_common', '-lpord', '-lmpiseq'] and --with-mumps-include=['~/mumps-git/build/include'] did not work ******************************************************************************* I also tried using --with-mumps-dir=~/mumps-git/build without success. Thanks for helping. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 2 06:33:41 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 2 Jun 2022 07:33:41 -0400 Subject: [petsc-users] Petsc with mingw64 In-Reply-To: References: Message-ID: For any configure error, you need to send configure.log Thanks, Matt On Thu, Jun 2, 2022 at 5:38 AM hamid badi wrote: > Hi, > > I want to compile petsc with openblas & mumps (sequential) under mingw64. > To do so, I compiled openblas and mumps without any problem. But when it > comes to petsc, configure can't find my mumps. 
> > I use the following configuration options : > > --with-shared-libraries=1 > --with-openmp=1 > --with-mpi=0 > --with-debugging=0 > --with-scalar-type=real > --with-x=0 > --COPTFLAGS=-O3 > --CXXOPTFLAGS=-O3 > --FOPTFLAGS=-O3 > --with-windows-graphics=0 > --with-openblas=1 > --with-openblas-dir=/mingw64/ > --with-mumps=1 > --with-mumps-include=~/mumps-git/build/include > --with-mumps-lib="-L~/mumps-git/build/lib -lsmumps -ldmumps -lmumps_common > -lpord -lmpiseq" > --with-mumps-serial=1 > > i get the following error : > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > --with-mumps-lib=['-L~/mumps-git/build/lib', '-lsmumps', '-ldmumps', > '-lmumps_common', '-lpord', '-lmpiseq'] and > --with-mumps-include=['~/mumps-git/build/include'] did not work > > ******************************************************************************* > > I also tried using --with-mumps-dir=~/mumps-git/build without success. > > Thanks for helping. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Jun 2 07:54:31 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 2 Jun 2022 07:54:31 -0500 (CDT) Subject: [petsc-users] Petsc with mingw64 In-Reply-To: References: Message-ID: <533147c0-ae85-eb97-2d54-c74caaf7fce4@mcs.anl.gov> On Thu, 2 Jun 2022, hamid badi wrote: > Hi, > > I want to compile petsc with openblas & mumps (sequential) under mingw64. > To do so, I compiled openblas and mumps without any problem. But when it > comes to petsc, configure can't find my mumps. > > I use the following configuration options : > > --with-shared-libraries=1 > --with-openmp=1 > --with-mpi=0 > --with-debugging=0 > --with-scalar-type=real > --with-x=0 > --COPTFLAGS=-O3 > --CXXOPTFLAGS=-O3 > --FOPTFLAGS=-O3 > --with-windows-graphics=0 > --with-openblas=1 > --with-openblas-dir=/mingw64/ The option here should be --with-blaslapack-dir [not --with-openblas=1 --with-openblas-dir=/mingw64/]. But then - the compiler would automatically search this path? If so - avoid specifying this option. > --with-mumps=1 > --with-mumps-include=~/mumps-git/build/include > --with-mumps-lib="-L~/mumps-git/build/lib -lsmumps -ldmumps -lmumps_common use '$HOME' instead of '~' Satish > -lpord -lmpiseq" > --with-mumps-serial=1 > > i get the following error : > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > ------------------------------------------------------------------------------- > --with-mumps-lib=['-L~/mumps-git/build/lib', '-lsmumps', '-ldmumps', > '-lmumps_common', '-lpord', '-lmpiseq'] and > --with-mumps-include=['~/mumps-git/build/include'] did not work > ******************************************************************************* > > I also tried using --with-mumps-dir=~/mumps-git/build without success. > > Thanks for helping. > From patrick.sanan at gmail.com Thu Jun 2 07:59:14 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Thu, 2 Jun 2022 14:59:14 +0200 Subject: [petsc-users] Mat created by DMStag cannot access ghost points In-Reply-To: <859FA50E-F3C2-4E54-AEDC-7C8A70D3FCE1@petsc.dev> References: <859FA50E-F3C2-4E54-AEDC-7C8A70D3FCE1@petsc.dev> Message-ID: Thanks, Barry and Changqing! 
That seems reasonable to me, so I'll make an MR with that change. Am Mi., 1. Juni 2022 um 20:06 Uhr schrieb Barry Smith : > > This appears to be a bug in the DMStag/Mat preallocator code. If you add > after the DMCreateMatrix() line in your code > > PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE)); > > Your code will run correctly. > > Patrick and Matt, > > MatPreallocatorPreallocate_Preallocator() has > > PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, p->nooffproc)); > > to make the assembly of the stag matrix from the preallocator matrix a > little faster, > > but then it never "undoes" this call. Hence the matrix is left in the > state where it will error if someone sets values from a different rank > (which they certainly can using DMStagMatSetValuesStencil(). > > I think you need to clear the NO_OFF_PROC at the end > of MatPreallocatorPreallocate_Preallocator() because just because the > preallocation process never needed communication does not mean that when > someone puts real values in the matrix they will never use communication; > they can put in values any dang way they please. > > I don't know why this bug has not come up before. > > Barry > > > On May 31, 2022, at 11:08 PM, Ye Changqing > wrote: > > Dear all, > > [BugReport.c] is a sample code, [BugReportParallel.output] is the output > when execute BugReport with mpiexec, [BugReportSerial.output] is the output > in serial execution. > > Best, > Changqing > > ------------------------------ > *???:* Dave May > *????:* 2022?5?31? 22:55 > *???:* Ye Changqing > *??:* petsc-users at mcs.anl.gov > *??:* Re: [petsc-users] Mat created by DMStag cannot access ghost points > > > > On Tue 31. May 2022 at 16:28, Ye Changqing > wrote: > > Dear developers of PETSc, > > I encountered a problem when using the DMStag module. The program could be > executed perfectly in serial, while errors are thrown out in parallel > (using mpiexec). Some rows in Mat cannot be accessed in local processes > when looping all elements in DMStag. The DM object I used only has one DOF > in each element. Hence, I could switch to the DMDA module easily, and the > program now is back to normal. > > Some snippets are below. > > Initialise a DMStag object: > PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, > DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1, > DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P))); > Created a Mat: > PetscCall(DMCreateMatrix(s_ctx->dm_P, A)); > Loop: > PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx, > &ny, &nz, &extrax, &extray, &extraz)); > for (ey = starty; ey < starty + ny; ++ey) > for (ex = startx; ex < startx + nx; ++ex) > { > ... > PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2, > &col[0], &val_A[0][0], ADD_VALUES)); // The traceback shows the problem is > in here. > } > > > In addition to the code or MWE, please forward us the complete stack trace > / error thrown to stdout. > > Thanks, > Dave > > > > Best, > Changqing > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 2 08:00:50 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 2 Jun 2022 09:00:50 -0400 Subject: [petsc-users] Mat created by DMStag cannot access ghost points In-Reply-To: References: <859FA50E-F3C2-4E54-AEDC-7C8A70D3FCE1@petsc.dev> Message-ID: On Thu, Jun 2, 2022 at 8:59 AM Patrick Sanan wrote: > Thanks, Barry and Changqing! That seems reasonable to me, so I'll make an > MR with that change. 
> Hi Patrick, In the MR, could you add that option to all places we internally use Preallocator? I think we mean it for those. Thanks, Matt > Am Mi., 1. Juni 2022 um 20:06 Uhr schrieb Barry Smith : > >> >> This appears to be a bug in the DMStag/Mat preallocator code. If you >> add after the DMCreateMatrix() line in your code >> >> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE)); >> >> Your code will run correctly. >> >> Patrick and Matt, >> >> MatPreallocatorPreallocate_Preallocator() has >> >> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, p->nooffproc)); >> >> to make the assembly of the stag matrix from the preallocator matrix a >> little faster, >> >> but then it never "undoes" this call. Hence the matrix is left in the >> state where it will error if someone sets values from a different rank >> (which they certainly can using DMStagMatSetValuesStencil(). >> >> I think you need to clear the NO_OFF_PROC at the end >> of MatPreallocatorPreallocate_Preallocator() because just because the >> preallocation process never needed communication does not mean that when >> someone puts real values in the matrix they will never use communication; >> they can put in values any dang way they please. >> >> I don't know why this bug has not come up before. >> >> Barry >> >> >> On May 31, 2022, at 11:08 PM, Ye Changqing >> wrote: >> >> Dear all, >> >> [BugReport.c] is a sample code, [BugReportParallel.output] is the output >> when execute BugReport with mpiexec, [BugReportSerial.output] is the output >> in serial execution. >> >> Best, >> Changqing >> >> ------------------------------ >> *???:* Dave May >> *????:* 2022?5?31? 22:55 >> *???:* Ye Changqing >> *??:* petsc-users at mcs.anl.gov >> *??:* Re: [petsc-users] Mat created by DMStag cannot access ghost points >> >> >> >> On Tue 31. May 2022 at 16:28, Ye Changqing >> wrote: >> >> Dear developers of PETSc, >> >> I encountered a problem when using the DMStag module. The program could >> be executed perfectly in serial, while errors are thrown out in parallel >> (using mpiexec). Some rows in Mat cannot be accessed in local processes >> when looping all elements in DMStag. The DM object I used only has one DOF >> in each element. Hence, I could switch to the DMDA module easily, and the >> program now is back to normal. >> >> Some snippets are below. >> >> Initialise a DMStag object: >> PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, >> DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1, >> DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P))); >> Created a Mat: >> PetscCall(DMCreateMatrix(s_ctx->dm_P, A)); >> Loop: >> PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx, >> &ny, &nz, &extrax, &extray, &extraz)); >> for (ey = starty; ey < starty + ny; ++ey) >> for (ex = startx; ex < startx + nx; ++ex) >> { >> ... >> PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2, >> &col[0], &val_A[0][0], ADD_VALUES)); // The traceback shows the problem is >> in here. >> } >> >> >> In addition to the code or MWE, please forward us the complete stack >> trace / error thrown to stdout. >> >> Thanks, >> Dave >> >> >> >> Best, >> Changqing >> >> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From bsmith at petsc.dev Thu Jun 2 09:55:17 2022
From: bsmith at petsc.dev (Barry Smith)
Date: Thu, 2 Jun 2022 10:55:17 -0400
Subject: [petsc-users] Petsc with mingw64
In-Reply-To:
References:
Message-ID:

   Configure should error with a very helpful message if --with-openblas or --with-openblas-dir are provided on the command line.

> On Jun 2, 2022, at 5:38 AM, hamid badi wrote:
>
> Hi,
>
> I want to compile petsc with openblas & mumps (sequential) under mingw64. To do so, I compiled openblas and mumps without any problem. But when it comes to petsc, configure can't find my mumps.
>
> I use the following configuration options :
>
> --with-shared-libraries=1
> --with-openmp=1
> --with-mpi=0
> --with-debugging=0
> --with-scalar-type=real
> --with-x=0
> --COPTFLAGS=-O3
> --CXXOPTFLAGS=-O3
> --FOPTFLAGS=-O3
> --with-windows-graphics=0
> --with-openblas=1
> --with-openblas-dir=/mingw64/
> --with-mumps=1
> --with-mumps-include=~/mumps-git/build/include
> --with-mumps-lib="-L~/mumps-git/build/lib -lsmumps -ldmumps -lmumps_common -lpord -lmpiseq"
> --with-mumps-serial=1
>
> i get the following error :
>
> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
> -------------------------------------------------------------------------------
> --with-mumps-lib=['-L~/mumps-git/build/lib', '-lsmumps', '-ldmumps', '-lmumps_common', '-lpord', '-lmpiseq'] and
> --with-mumps-include=['~/mumps-git/build/include'] did not work
> *******************************************************************************
>
> I also tried using --with-mumps-dir=~/mumps-git/build without success.
>
> Thanks for helping.

From lidia.varsh at mail.ioffe.ru Fri Jun 3 05:36:32 2022
From: lidia.varsh at mail.ioffe.ru (Lidia)
Date: Fri, 3 Jun 2022 13:36:32 +0300
Subject: [petsc-users] Sparse linear system solving
In-Reply-To:
References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru> <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru>
Message-ID: <5475809d-16b1-e6a8-72b8-c4605f321e8e@mail.ioffe.ru>

Dear Matt, Barry,

thank you for the information about openMP! Now all processes are loaded well. But we see a strange behaviour of the running times at different iterations, see the description below. Could you please explain the reason and how we can improve it?

We need to quickly solve a big (about 1e6 rows) square sparse non-symmetric matrix many times (about 1e5 times) consecutively. The matrix is constant at every iteration, and the right-hand-side vector B changes slowly (we think its change at every iteration should be less than 0.001 %). So we use the previous solution vector X as the initial guess for the next iteration. An AMG preconditioner and a GMRES solver are used.

We have tested the code using a matrix with 631 000 rows, over 15 consecutive iterations, using the vector X from the previous iterations. The right-hand-side vector B and the matrix A are constant during the whole run. The time of the first iteration is large (about 2 seconds) and quickly decreases over the next iterations (the average time of the last iterations is about 0.00008 s). But some iterations in the middle (# 2 and # 12) take a huge time of 0.999063 seconds (see the figure with the time dynamics attached). This time of 0.999 seconds does not depend on the size of the matrix or on the number of MPI processes, and the jumps also appear if we vary the vector B. Why do these time jumps appear and how can we avoid them?
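For clarity, here is a minimal sketch of such a repeated-solve loop in PETSc (the names A, B, X and nIterations are placeholders; this is only an illustration, not our actual C++ code, and error checking and the update of B are omitted):

   KSP ksp;
   PC  pc;
   PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
   PetscCall(KSPSetOperators(ksp, A, A));                 /* A does not change between solves */
   PetscCall(KSPSetType(ksp, KSPGMRES));
   PetscCall(KSPGetPC(ksp, &pc));
   PetscCall(PCSetType(pc, PCGAMG));
   PetscCall(KSPSetInitialGuessNonzero(ksp, PETSC_TRUE)); /* reuse the previous X as initial guess */
   PetscCall(KSPSetFromOptions(ksp));

   for (PetscInt it = 0; it < nIterations; ++it) {
     /* B would be updated slightly here */
     PetscCall(KSPSolve(ksp, B, X));                      /* X holds the previous solution on entry */
   }
   PetscCall(KSPDestroy(&ksp));

With this structure the GAMG setup should happen only inside the first KSPSolve(), since the operator never changes, which is why the first iteration is expected to be much slower than the following ones.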
The ksp_monitor out for this running (included 15 iterations) using 36 MPI processes and a file with the memory bandwidth information (testSpeed) are also attached. We can provide our C++ script if it is needed. Thanks a lot! Best, Lidiia On 01.06.2022 21:14, Matthew Knepley wrote: > On Wed, Jun 1, 2022 at 1:43 PM Lidia wrote: > > Dear Matt, > > Thank you for the rule of 10,000 variables per process! We have > run ex.5 with matrix 1e4 x 1e4 at our cluster and got a good > performance dynamics (see the figure "performance.png" - > dependency of the solving time in seconds on the number of cores). > We have used GAMG preconditioner (multithread: we have added the > option "-pc_gamg_use_parallel_coarse_grid_solver") and GMRES > solver. And we have set one openMP thread to every MPI process. > Now the ex.5 is working good on many mpi processes! But the > running uses about 100 GB of RAM. > > How we can run ex.5 using many openMP threads without mpi? If we > just change the running command, the cores are not loaded > normally: usually just one core is loaded in 100 % and others are > idle. Sometimes all cores are working in 100 % during 1 second but > then again become idle about 30 seconds. Can the preconditioner > use many threads and how to activate this option? > > > Maye you could describe what you are trying to accomplish? Threads and > processes are not really different, except for memory sharing. > However, sharing large complex data structures rarely works. That is > why they get partitioned and operate effectively as distributed > memory. You would not really save memory by using > threads in this instance, if that is your goal. This is detailed in > the talks in this session (see 2016 PP Minisymposium on this page > https://cse.buffalo.edu/~knepley/relacs.html). > > ? Thanks, > > ? ? ?Matt > > The solving times (the time of the solver work) using 60 openMP > threads is 511 seconds now, and while using 60 MPI processes - > 13.19 seconds. > > ksp_monitor outs for both cases (many openMP threads or many MPI > processes) are attached. > > > Thank you! > > Best, > Lidia > > On 31.05.2022 15:21, Matthew Knepley wrote: >> I have looked at the local logs. First, you have run problems of >> size 12? and 24. As a rule of thumb, you need 10,000 >> variables per process in order to see good speedup. >> >> ? Thanks, >> >> ? ? ?Matt >> >> On Tue, May 31, 2022 at 8:19 AM Matthew Knepley >> wrote: >> >> On Tue, May 31, 2022 at 7:39 AM Lidia >> wrote: >> >> Matt, Mark, thank you much for your answers! >> >> >> Now we have run example # 5 on our computer cluster and >> on the local server and also have not seen any >> performance increase, but by unclear reason running times >> on the local server are much better than on the cluster. >> >> I suspect that you are trying to get speedup without >> increasing the memory bandwidth: >> >> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup >> >> ? Thanks, >> >> ? ? ?Matt >> >> Now we will try to run petsc #5 example inside a docker >> container on our server and see if the problem is in our >> environment. I'll write you the results of this test as >> soon as we get it. >> >> The ksp_monitor outs for the 5th test at the current >> local server configuration (for 2 and 4 mpi processes) >> and for the cluster (for 1 and 3 mpi processes) are >> attached . >> >> >> And one more question. Potentially we can use 10 nodes >> and 96 threads at each node on our cluster. 
What do you >> think, which combination of numbers of mpi processes and >> openmp threads may be the best for the 5th example? >> >> Thank you! >> >> >> Best, >> Lidiia >> >> On 31.05.2022 05:42, Mark Adams wrote: >>> And if you see "NO" change in performance I suspect the >>> solver/matrix is all on one processor. >>> (PETSc does not use threads by default so threads should >>> not change anything). >>> >>> As Matt said, it is best to start with a PETSc >>> example?that does something like what you want (parallel >>> linear solve, see src/ksp/ksp/tutorials for examples), >>> and then add your code to it. >>> That way you get the basic infrastructure?in place for >>> you, which is pretty obscure to the uninitiated. >>> >>> Mark >>> >>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley >>> wrote: >>> >>> On Mon, May 30, 2022 at 10:12 PM Lidia >>> wrote: >>> >>> Dear colleagues, >>> >>> Is here anyone who have solved big sparse linear >>> matrices using PETSC? >>> >>> >>> There are lots of publications with this kind of >>> data. Here is one recent one: >>> https://arxiv.org/abs/2204.01722 >>> >>> We have found NO performance improvement while >>> using more and more mpi >>> processes (1-2-3) and open-mp threads (from 1 to >>> 72 threads). Did anyone >>> faced to this problem? Does anyone know any >>> possible reasons of such >>> behaviour? >>> >>> >>> Solver behavior is dependent on the input matrix. >>> The only general-purpose solvers >>> are direct, but they do not scale linearly and have >>> high memory requirements. >>> >>> Thus, in order to make progress you will have to be >>> specific about your matrices. >>> >>> We use AMG preconditioner and GMRES solver from >>> KSP package, as our >>> matrix is large (from 100 000 to 1e+6 rows and >>> columns), sparse, >>> non-symmetric and includes both positive and >>> negative values. But >>> performance problems also exist while using CG >>> solvers with symmetric >>> matrices. >>> >>> >>> There are many PETSc examples, such as example 5 for >>> the Laplacian, that exhibit >>> good scaling with both AMG and GMG. >>> >>> Could anyone help us to set appropriate options >>> of the preconditioner >>> and solver? Now we use default parameters, maybe >>> they are not the best, >>> but we do not know a good combination. Or maybe >>> you could suggest any >>> other pairs of preconditioner+solver for such tasks? >>> >>> I can provide more information: the matrices >>> that we solve, c++ script >>> to run solving using petsc and any statistics >>> obtained by our runs. >>> >>> >>> First, please provide a description of the linear >>> system, and the output of >>> >>> ? -ksp_view -ksp_monitor_true_residual >>> -ksp_converged_reason -log_view >>> >>> for each test case. >>> >>> ? Thanks, >>> >>> ? ? ?Matt >>> >>> Thank you in advance! >>> >>> Best regards, >>> Lidiia Varshavchik, >>> Ioffe Institute, St. Petersburg, Russia >>> >>> >>> >>> -- >>> What most experimenters take for granted before they >>> begin their experiments is infinitely more >>> interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin >> their experiments is infinitely more interesting than any >> results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- [lida at head1 build]$ mpirun -n 36 ./petscTest -ksp_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view -------------------------------------------------------------------------- WARNING: No preset parameters were found for the device that Open MPI detected: Local host: head1 Device name: i40iw0 Device vendor ID: 0x8086 Device vendor part ID: 14290 Default device parameters will be used, which may result in lower performance. You can edit any of the files specified by the btl_openib_device_param_files MCA parameter to set values for your device. NOTE: You can turn off this warning by setting the MCA parameter btl_openib_warn_no_device_params_found to 0. -------------------------------------------------------------------------- -------------------------------------------------------------------------- No OpenFabrics connection schemes reported that they were able to be used on a specific port. As such, the openib BTL (OpenFabrics support) will be disabled for this port. Local host: head1 Local device: i40iw0 Local port: 1 CPCs attempted: rdmacm, udcm -------------------------------------------------------------------------- Mat size 630834 using block size is 1 5 17524 87620 105144 1 17524 17524 35048 7 17523 122667 140190 21 17523 367989 385512 27 17523 473127 490650 31 17523 543219 560742 2 17524 35048 52572 3 17524 52572 70096 4 17524 70096 87620 0 17524 0 17524 6 17523 105144 122667 8 17523 140190 157713 9 17523 157713 175236 11 17523 192759 210282 12 17523 210282 227805 13 17523 227805 245328 14 17523 245328 262851 20 17523 350466 367989 22 17523 385512 403035 23 17523 403035 420558 25 17523 438081 455604 26 17523 455604 473127 28 17523 490650 508173 30 17523 525696 543219 33 17523 578265 595788 34 17523 595788 613311 35 17523 613311 630834 15 17523 262851 280374 16 17523 280374 297897 17 17523 297897 315420 18 17523 315420 332943 19 17523 332943 350466 24 17523 420558 438081 29 17523 508173 525696 10 17523 175236 192759 32 17523 560742 578265 [head1.hpc:242461] 71 more processes have sent help message help-mpi-btl-openib.txt / no device params found [head1.hpc:242461] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages [head1.hpc:242461] 71 more processes have sent help message help-mpi-btl-openib-cpc-base.txt / no cpcs for port Compute with tolerance 0.000010000000000000000818030539 solver is gmres startPC startSolv 0 KSP Residual norm 1.868353493329e+08 0 KSP preconditioned resid norm 1.868353493329e+08 true resid norm 2.165031654579e+06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.132315559206e+08 1 KSP preconditioned resid norm 1.132315559206e+08 true resid norm 6.461246152989e+07 ||r(i)||/||b|| 2.984365673971e+01 2 KSP Residual norm 1.534820972084e+07 2 KSP preconditioned resid norm 1.534820972084e+07 true resid norm 2.426876823961e+07 ||r(i)||/||b|| 
1.120942882672e+01 3 KSP Residual norm 7.539322505186e+06 3 KSP preconditioned resid norm 7.539322505186e+06 true resid norm 1.829739078019e+07 ||r(i)||/||b|| 8.451327139485e+00 4 KSP Residual norm 4.660669278808e+06 4 KSP preconditioned resid norm 4.660669278808e+06 true resid norm 1.744671242073e+07 ||r(i)||/||b|| 8.058409854574e+00 5 KSP Residual norm 3.223391594815e+06 5 KSP preconditioned resid norm 3.223391594815e+06 true resid norm 1.737561446785e+07 ||r(i)||/||b|| 8.025570633618e+00 6 KSP Residual norm 2.240424900880e+06 6 KSP preconditioned resid norm 2.240424900880e+06 true resid norm 1.683362112781e+07 ||r(i)||/||b|| 7.775230949719e+00 7 KSP Residual norm 1.623399472779e+06 7 KSP preconditioned resid norm 1.623399472779e+06 true resid norm 1.624000914301e+07 ||r(i)||/||b|| 7.501049284271e+00 8 KSP Residual norm 1.211518107569e+06 8 KSP preconditioned resid norm 1.211518107569e+06 true resid norm 1.558830757667e+07 ||r(i)||/||b|| 7.200036795627e+00 9 KSP Residual norm 9.642201969240e+05 9 KSP preconditioned resid norm 9.642201969240e+05 true resid norm 1.486473650844e+07 ||r(i)||/||b|| 6.865828717562e+00 10 KSP Residual norm 7.867651557046e+05 10 KSP preconditioned resid norm 7.867651557046e+05 true resid norm 1.396084153269e+07 ||r(i)||/||b|| 6.448331368812e+00 11 KSP Residual norm 7.078405789961e+05 11 KSP preconditioned resid norm 7.078405789961e+05 true resid norm 1.296873719329e+07 ||r(i)||/||b|| 5.990091260724e+00 12 KSP Residual norm 6.335098563709e+05 12 KSP preconditioned resid norm 6.335098563709e+05 true resid norm 1.164201582227e+07 ||r(i)||/||b|| 5.377295892022e+00 13 KSP Residual norm 5.397665070507e+05 13 KSP preconditioned resid norm 5.397665070507e+05 true resid norm 1.042661489959e+07 ||r(i)||/||b|| 4.815917992485e+00 14 KSP Residual norm 4.549629296863e+05 14 KSP preconditioned resid norm 4.549629296863e+05 true resid norm 9.420542232153e+06 ||r(i)||/||b|| 4.351226095114e+00 15 KSP Residual norm 3.627838605442e+05 15 KSP preconditioned resid norm 3.627838605442e+05 true resid norm 8.546289749804e+06 ||r(i)||/||b|| 3.947420229042e+00 16 KSP Residual norm 2.974632184520e+05 16 KSP preconditioned resid norm 2.974632184520e+05 true resid norm 7.707507230485e+06 ||r(i)||/||b|| 3.559997478181e+00 17 KSP Residual norm 2.584437744774e+05 17 KSP preconditioned resid norm 2.584437744774e+05 true resid norm 6.996748201244e+06 ||r(i)||/||b|| 3.231707114510e+00 18 KSP Residual norm 2.172287358399e+05 18 KSP preconditioned resid norm 2.172287358399e+05 true resid norm 6.008578157843e+06 ||r(i)||/||b|| 2.775284206646e+00 19 KSP Residual norm 1.807320553225e+05 19 KSP preconditioned resid norm 1.807320553225e+05 true resid norm 5.166440962968e+06 ||r(i)||/||b|| 2.386311974719e+00 20 KSP Residual norm 1.583700438237e+05 20 KSP preconditioned resid norm 1.583700438237e+05 true resid norm 4.613820989743e+06 ||r(i)||/||b|| 2.131063986978e+00 21 KSP Residual norm 1.413879944302e+05 21 KSP preconditioned resid norm 1.413879944302e+05 true resid norm 4.151504476178e+06 ||r(i)||/||b|| 1.917525994318e+00 22 KSP Residual norm 1.228172205521e+05 22 KSP preconditioned resid norm 1.228172205521e+05 true resid norm 3.630290527838e+06 ||r(i)||/||b|| 1.676784041545e+00 23 KSP Residual norm 1.084793002546e+05 23 KSP preconditioned resid norm 1.084793002546e+05 true resid norm 3.185566371074e+06 ||r(i)||/||b|| 1.471371729986e+00 24 KSP Residual norm 9.520569914833e+04 24 KSP preconditioned resid norm 9.520569914833e+04 true resid norm 2.811378949429e+06 ||r(i)||/||b|| 1.298539420189e+00 25 KSP 
Residual norm 8.331027569193e+04 25 KSP preconditioned resid norm 8.331027569193e+04 true resid norm 2.487128345424e+06 ||r(i)||/||b|| 1.148772277839e+00 26 KSP Residual norm 7.116546817077e+04 26 KSP preconditioned resid norm 7.116546817077e+04 true resid norm 2.128784852233e+06 ||r(i)||/||b|| 9.832580728002e-01 27 KSP Residual norm 6.107201042673e+04 27 KSP preconditioned resid norm 6.107201042673e+04 true resid norm 1.816742057822e+06 ||r(i)||/||b|| 8.391295591358e-01 28 KSP Residual norm 5.407959454186e+04 28 KSP preconditioned resid norm 5.407959454186e+04 true resid norm 1.590698721931e+06 ||r(i)||/||b|| 7.347230783285e-01 29 KSP Residual norm 4.859208455279e+04 29 KSP preconditioned resid norm 4.859208455279e+04 true resid norm 1.405619902078e+06 ||r(i)||/||b|| 6.492375753974e-01 30 KSP Residual norm 4.463327440008e+04 30 KSP preconditioned resid norm 4.463327440008e+04 true resid norm 1.258789113490e+06 ||r(i)||/||b|| 5.814183413104e-01 31 KSP Residual norm 3.927742507325e+04 31 KSP preconditioned resid norm 3.927742507325e+04 true resid norm 1.086402490838e+06 ||r(i)||/||b|| 5.017951994097e-01 32 KSP Residual norm 3.417683630748e+04 32 KSP preconditioned resid norm 3.417683630748e+04 true resid norm 9.566603594382e+05 ||r(i)||/||b|| 4.418689941159e-01 33 KSP Residual norm 3.002775921838e+04 33 KSP preconditioned resid norm 3.002775921838e+04 true resid norm 8.429546731968e+05 ||r(i)||/||b|| 3.893498145460e-01 34 KSP Residual norm 2.622152046131e+04 34 KSP preconditioned resid norm 2.622152046131e+04 true resid norm 7.578781071384e+05 ||r(i)||/||b|| 3.500540537296e-01 35 KSP Residual norm 2.264910466846e+04 35 KSP preconditioned resid norm 2.264910466846e+04 true resid norm 6.684892523160e+05 ||r(i)||/||b|| 3.087665027447e-01 36 KSP Residual norm 1.970721593805e+04 36 KSP preconditioned resid norm 1.970721593805e+04 true resid norm 5.905536805578e+05 ||r(i)||/||b|| 2.727690744422e-01 37 KSP Residual norm 1.666104858674e+04 37 KSP preconditioned resid norm 1.666104858674e+04 true resid norm 5.172223947409e+05 ||r(i)||/||b|| 2.388983060118e-01 38 KSP Residual norm 1.432004409785e+04 38 KSP preconditioned resid norm 1.432004409785e+04 true resid norm 4.593351142808e+05 ||r(i)||/||b|| 2.121609230559e-01 39 KSP Residual norm 1.211549914084e+04 39 KSP preconditioned resid norm 1.211549914084e+04 true resid norm 4.019170298644e+05 ||r(i)||/||b|| 1.856402556583e-01 40 KSP Residual norm 1.061599294842e+04 40 KSP preconditioned resid norm 1.061599294842e+04 true resid norm 3.586589723898e+05 ||r(i)||/||b|| 1.656599207828e-01 41 KSP Residual norm 9.577489574913e+03 41 KSP preconditioned resid norm 9.577489574913e+03 true resid norm 3.221505690964e+05 ||r(i)||/||b|| 1.487971635034e-01 42 KSP Residual norm 8.221576307371e+03 42 KSP preconditioned resid norm 8.221576307371e+03 true resid norm 2.745213067979e+05 ||r(i)||/||b|| 1.267978258965e-01 43 KSP Residual norm 6.898384710028e+03 43 KSP preconditioned resid norm 6.898384710028e+03 true resid norm 2.330710645170e+05 ||r(i)||/||b|| 1.076524973776e-01 44 KSP Residual norm 6.087330352788e+03 44 KSP preconditioned resid norm 6.087330352788e+03 true resid norm 2.058183089407e+05 ||r(i)||/||b|| 9.506480355857e-02 45 KSP Residual norm 5.207144067562e+03 45 KSP preconditioned resid norm 5.207144067562e+03 true resid norm 1.745194864065e+05 ||r(i)||/||b|| 8.060828396546e-02 46 KSP Residual norm 4.556037825199e+03 46 KSP preconditioned resid norm 4.556037825199e+03 true resid norm 1.551715592432e+05 ||r(i)||/||b|| 7.167172771584e-02 47 KSP Residual 
norm 3.856329202278e+03 47 KSP preconditioned resid norm 3.856329202278e+03 true resid norm 1.315660202980e+05 ||r(i)||/||b|| 6.076863588562e-02 48 KSP Residual norm 3.361878313389e+03 48 KSP preconditioned resid norm 3.361878313389e+03 true resid norm 1.147746368397e+05 ||r(i)||/||b|| 5.301291396685e-02 49 KSP Residual norm 2.894852363045e+03 49 KSP preconditioned resid norm 2.894852363045e+03 true resid norm 9.951811967458e+04 ||r(i)||/||b|| 4.596612685273e-02 50 KSP Residual norm 2.576639763678e+03 50 KSP preconditioned resid norm 2.576639763678e+03 true resid norm 8.828512403741e+04 ||r(i)||/||b|| 4.077775207151e-02 51 KSP Residual norm 2.176356645511e+03 51 KSP preconditioned resid norm 2.176356645511e+03 true resid norm 7.535533182060e+04 ||r(i)||/||b|| 3.480564898957e-02 52 KSP Residual norm 1.909590120581e+03 52 KSP preconditioned resid norm 1.909590120581e+03 true resid norm 6.643741378378e+04 ||r(i)||/||b|| 3.068657848177e-02 53 KSP Residual norm 1.625794696835e+03 53 KSP preconditioned resid norm 1.625794696835e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 53 ################################################################################# SOLV gmres iter 0 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 53 time 2.000408 s(2000407820.91200017929077148438 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 1 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000082 s(82206.13200000001234002411 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 2 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.999076 s(999076088.56700003147125244141 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 3 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000081 s(80689.84000000001105945557 us) ################################################################################# startPC startSolv 0 KSP Residual norm 
1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 4 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000079 s(79139.94299999999930150807 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 5 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000065 s(65399.49300000000948784873 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 6 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000080 s(79554.38999999999941792339 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 7 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000080 s(80431.21900000001187436283 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 8 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000080 s(80255.19100000000617001206 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 9 Relative error is 1.009408197(min 1.000000000, max 
2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000081 s(80568.19700000000011641532 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 10 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000078 s(78323.06299999999464489520 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 11 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000072 s(71933.38600000001315493137 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 12 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.999063 s(999063438.25300002098083496094 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 13 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000070 s(69632.13800000000628642738 us) ################################################################################# startPC startSolv 0 KSP Residual norm 1.625794716222e+03 0 KSP preconditioned resid norm 1.625794716222e+03 true resid norm 5.591842695771e+04 ||r(i)||/||b|| 2.582799509625e-02 Linear solve converged due to CONVERGED_RTOL iterations 0 ################################################################################# SOLV gmres iter 14 Relative error is 1.009408197(min 1.000000000, max 2.386602308), tril rel error 0.138442009(min 0.000000070, max 1.000000000) (converged reason is CONVERGED_RTOL) iterations 0 time 0.000073 s(73498.46099999999569263309 us) ################################################################################# nohup: appending output to ?nohup.out? 
nohup: failed to run command ?localc?: No such file or directory **************************************** *********************************************************************************************************************** *** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** **************************************************************************************************************************************************************** ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------- ./petscTest on a named head1.hpc with 36 processors, by lida Fri Jun 3 13:23:29 2022 Using Petsc Release Version 3.17.1, unknown Max Max/Min Avg Total Time (sec): 8.454e+01 1.440 5.941e+01 Objects: 7.030e+02 1.000 7.030e+02 Flops: 1.018e+09 2.522 5.062e+08 1.822e+10 Flops/sec: 1.734e+07 3.633 8.567e+06 3.084e+08 MPI Msg Count: 5.257e+04 1.584 4.249e+04 1.530e+06 MPI Msg Len (bytes): 1.453e+09 14.133 2.343e+04 3.585e+10 MPI Reductions: 7.800e+02 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 5.9406e+01 100.0% 1.8223e+10 100.0% 1.530e+06 100.0% 2.343e+04 100.0% 7.620e+02 97.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 75 1.0 3.5652e+0155.4 0.00e+00 0.0 1.9e+04 8.0e+00 7.5e+01 8 0 1 0 10 8 0 1 0 10 0 BuildTwoSidedF 51 1.0 3.5557e+0157.2 0.00e+00 0.0 8.5e+03 3.8e+05 5.1e+01 8 0 1 9 7 8 0 1 9 7 0 MatMult 1503 1.0 2.8036e+00 1.3 6.78e+08 3.8 1.1e+06 2.4e+04 4.0e+00 4 53 74 75 1 4 53 74 75 1 3473 MatMultAdd 328 1.0 2.2706e-01 2.0 2.43e+07 2.3 1.5e+05 3.0e+03 0.0e+00 0 2 10 1 0 0 2 10 1 0 1985 MatMultTranspose 328 1.0 4.6323e-01 2.6 4.98e+07 4.7 1.5e+05 3.0e+03 4.0e+00 0 3 10 1 1 0 3 10 1 1 1090 MatSolve 82 1.0 2.3332e-04 2.0 1.23e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 190 MatLUFactorSym 1 1.0 1.2696e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 2.1874e-05 2.1 1.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 26 MatConvert 5 1.0 3.0352e-02 1.2 0.00e+00 0.0 5.7e+03 9.8e+03 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatScale 12 1.0 9.6534e-03 2.1 2.13e+06 4.1 2.8e+03 2.0e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 2795 MatResidual 328 1.0 5.8371e-01 1.2 1.50e+08 4.7 2.3e+05 2.0e+04 0.0e+00 1 10 15 13 0 1 10 15 13 0 3018 MatAssemblyBegin 70 1.0 3.5348e+0142.1 0.00e+00 0.0 8.5e+03 3.8e+05 2.4e+01 9 0 1 9 3 9 0 1 9 3 0 MatAssemblyEnd 70 1.0 2.0657e+00 1.0 7.97e+06573.7 0.0e+00 0.0e+00 8.1e+01 3 0 0 0 10 3 0 0 0 11 7 MatGetRowIJ 1 1.0 4.5262e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 1 1.0 3.6275e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatCreateSubMat 2 1.0 2.2424e-03 1.1 0.00e+00 0.0 3.5e+01 1.2e+03 2.8e+01 0 0 0 0 4 0 0 0 0 4 0 MatGetOrdering 1 1.0 3.3737e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 4 1.0 2.7961e-01 2.3 0.00e+00 0.0 3.1e+04 5.1e+04 2.4e+01 0 0 2 4 3 0 0 2 4 3 0 MatZeroEntries 4 1.0 6.2166e-04176.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 4 1.0 1.0507e-02 1.1 2.81e+04 1.6 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 63 MatTranspose 8 1.0 3.0698e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 12 1.0 1.5384e-01 1.2 0.00e+00 0.0 8.5e+03 2.0e+04 3.6e+01 0 0 1 0 5 0 0 1 0 5 0 MatMatMultNum 12 1.0 7.4942e-02 1.3 1.29e+07 8.0 2.8e+03 2.0e+04 4.0e+00 0 1 0 0 1 0 1 0 0 1 1636 MatPtAPSymbolic 4 1.0 5.7252e-01 1.0 0.00e+00 0.0 1.4e+04 4.8e+04 2.8e+01 1 0 1 2 4 1 0 1 2 4 0 MatPtAPNumeric 4 1.0 9.4753e-01 1.0 3.99e+0714.7 3.7e+03 1.0e+05 2.0e+01 2 1 0 1 3 2 1 0 1 3 256 MatTrnMatMultSym 1 1.0 2.3274e+00 1.0 0.00e+00 0.0 5.6e+03 7.0e+05 1.2e+01 4 0 0 11 2 4 0 0 11 2 0 MatRedundantMat 1 1.0 3.9176e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatMPIConcateSeq 1 1.0 3.0197e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetLocalMat 13 1.0 8.8896e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 12 1.0 2.0289e-01 1.8 0.00e+00 0.0 2.0e+04 3.8e+04 0.0e+00 0 0 
1 2 0 0 0 1 2 0 0 VecMDot 93 1.0 2.7982e-01 3.2 5.32e+07 1.0 0.0e+00 0.0e+00 9.3e+01 0 10 0 0 12 0 10 0 0 12 6711 VecNorm 209 1.0 3.7674e-01 4.3 6.40e+06 1.0 0.0e+00 0.0e+00 2.1e+02 0 1 0 0 27 0 1 0 0 27 591 VecScale 112 1.0 5.5057e-04 1.5 1.50e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 91073 VecCopy 1071 1.0 1.4028e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 1293 1.0 6.6862e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 72 1.0 1.4381e-03 1.6 2.44e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 60574 VecAYPX 2036 1.0 2.0018e-02 1.7 1.96e+07 1.5 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 23728 VecAXPBYCZ 656 1.0 7.2191e-03 1.7 2.30e+07 1.6 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 74818 VecMAXPY 151 1.0 8.7415e-02 1.3 1.06e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 21 0 0 0 0 21 0 0 0 43052 VecAssemblyBegin 28 1.0 2.6426e-01100.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.7e+01 0 0 0 0 3 0 0 0 0 4 0 VecAssemblyEnd 28 1.0 4.3833e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 1356 1.0 1.7490e-02 1.5 9.52e+06 1.6 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 12767 VecScatterBegin 2341 1.0 5.3816e-01 5.2 0.00e+00 0.0 1.5e+06 2.0e+04 1.6e+01 1 0 95 81 2 1 0 95 81 2 0 VecScatterEnd 2341 1.0 2.4118e+00 1.9 2.55e+07368.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 23 VecNormalize 112 1.0 7.0541e-02 2.8 4.50e+06 1.1 0.0e+00 0.0e+00 1.1e+02 0 1 0 0 14 0 1 0 0 15 2132 SFSetGraph 31 1.0 2.5626e-0218.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 24 1.0 1.0973e-01 1.6 0.00e+00 0.0 2.9e+04 1.5e+04 2.4e+01 0 0 2 1 3 0 0 2 1 3 0 SFBcastBegin 28 1.0 2.3203e-02 3.2 0.00e+00 0.0 2.5e+04 5.7e+04 0.0e+00 0 0 2 4 0 0 0 2 4 0 0 SFBcastEnd 28 1.0 7.4158e-02 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 2369 1.0 4.4483e-0118.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 SFUnpack 2369 1.0 3.6313e-0271.6 2.55e+07368.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1497 KSPSetUp 11 1.0 5.1386e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 15 1.0 3.4692e+00 1.0 9.40e+08 2.4 1.4e+06 1.9e+04 2.3e+02 6 95 90 73 30 6 95 90 73 30 4972 KSPGMRESOrthog 93 1.0 3.1573e-01 2.5 1.06e+08 1.0 0.0e+00 0.0e+00 9.3e+01 0 21 0 0 12 0 21 0 0 12 11896 PCGAMGGraph_AGG 4 1.0 3.0876e-01 1.0 1.83e+06 4.7 8.5e+03 1.3e+04 3.6e+01 1 0 1 0 5 1 0 1 0 5 70 PCGAMGCoarse_AGG 4 1.0 2.8281e+00 1.1 0.00e+00 0.0 4.8e+04 1.3e+05 4.7e+01 5 0 3 17 6 5 0 3 17 6 0 PCGAMGProl_AGG 4 1.0 3.2106e-01 1.8 0.00e+00 0.0 1.2e+04 2.4e+04 6.3e+01 0 0 1 1 8 0 0 1 1 8 0 PCGAMGPOpt_AGG 4 1.0 2.6704e-01 1.0 2.82e+07 3.0 4.5e+04 1.8e+04 1.6e+02 0 2 3 2 21 0 2 3 2 22 1589 GAMG: createProl 4 1.0 3.5902e+00 1.0 3.01e+07 3.0 1.1e+05 6.6e+04 3.1e+02 6 2 7 21 40 6 2 7 21 41 124 Create Graph 4 1.0 3.0346e-02 1.2 0.00e+00 0.0 5.7e+03 9.8e+03 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 Filter Graph 4 1.0 2.8303e-01 1.0 1.83e+06 4.7 2.8e+03 2.0e+04 3.2e+01 0 0 0 0 4 0 0 0 0 4 76 MIS/Agg 4 1.0 2.7965e-01 2.3 0.00e+00 0.0 3.1e+04 5.1e+04 2.4e+01 0 0 2 4 3 0 0 2 4 3 0 SA: col data 4 1.0 2.1554e-02 1.2 0.00e+00 0.0 9.2e+03 2.9e+04 2.7e+01 0 0 1 1 3 0 0 1 1 4 0 SA: frmProl0 4 1.0 1.5549e-01 1.0 0.00e+00 0.0 2.6e+03 5.5e+03 2.0e+01 0 0 0 0 3 0 0 0 0 3 0 SA: smooth 4 1.0 1.8642e-01 1.0 2.16e+06 4.0 1.1e+04 2.0e+04 5.2e+01 0 0 1 1 7 0 0 1 1 7 148 GAMG: partLevel 4 1.0 1.5208e+00 1.0 3.99e+0714.7 1.8e+04 5.9e+04 1.0e+02 3 1 1 3 13 3 1 1 3 13 159 repartition 1 1.0 3.5234e-03 1.0 0.00e+00 0.0 8.0e+01 5.3e+02 5.3e+01 0 0 0 0 7 0 0 0 0 7 0 
Invert-Sort 1 1.0 3.3143e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 Move A 1 1.0 1.1502e-03 1.2 0.00e+00 0.0 3.5e+01 1.2e+03 1.5e+01 0 0 0 0 2 0 0 0 0 2 0 Move P 1 1.0 1.4414e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 0 0 0 0 2 0 0 0 0 2 0 PCGAMG Squ l00 1 1.0 2.3274e+00 1.0 0.00e+00 0.0 5.6e+03 7.0e+05 1.2e+01 4 0 0 11 2 4 0 0 11 2 0 PCGAMG Gal l00 1 1.0 1.2443e+00 1.0 1.06e+07 5.0 9.0e+03 1.1e+05 1.2e+01 2 1 1 3 2 2 1 1 3 2 135 PCGAMG Opt l00 1 1.0 1.4166e-01 1.0 5.88e+05 1.7 4.5e+03 4.8e+04 1.0e+01 0 0 0 1 1 0 0 0 1 1 130 PCGAMG Gal l01 1 1.0 2.5946e-01 1.0 2.64e+07543.5 6.8e+03 1.2e+04 1.2e+01 0 0 0 0 2 0 0 0 0 2 271 PCGAMG Opt l01 1 1.0 2.9430e-02 1.0 1.11e+06444.1 4.9e+03 1.5e+03 1.0e+01 0 0 0 0 1 0 0 0 0 1 90 PCGAMG Gal l02 1 1.0 1.2971e-02 1.0 3.34e+06 0.0 2.1e+03 1.3e+03 1.2e+01 0 0 0 0 2 0 0 0 0 2 343 PCGAMG Opt l02 1 1.0 3.6016e-03 1.0 2.61e+05 0.0 2.0e+03 2.1e+02 1.0e+01 0 0 0 0 1 0 0 0 0 1 97 PCGAMG Gal l03 1 1.0 5.7189e-04 1.1 1.45e+04 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 2 0 0 0 0 2 25 PCGAMG Opt l03 1 1.0 4.1255e-04 1.1 5.02e+03 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 1 0 0 0 0 1 12 PCSetUp 1 1.0 5.1101e+00 1.0 6.99e+07 5.5 1.3e+05 6.5e+04 4.6e+02 9 4 9 24 58 9 4 9 24 60 135 PCApply 82 1.0 2.5729e+00 1.0 7.17e+08 4.0 1.2e+06 1.6e+04 1.4e+01 4 49 80 53 2 4 49 80 53 2 3488 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 8 2 1248 0. Matrix 119 66 25244856 0. Matrix Coarsen 4 4 2688 0. Vector 402 293 26939160 0. Index Set 67 58 711824 0. Star Forest Graph 49 28 36128 0. Krylov Solver 11 4 124000 0. Preconditioner 11 4 3712 0. Viewer 1 0 0 0. PetscRandom 4 4 2840 0. Distributed Mesh 9 4 20512 0. Discrete System 9 4 4096 0. Weak Form 9 4 2656 0. 
======================================================================================================================== Average time to get PetscTime(): 2.86847e-08 Average time for MPI_Barrier(): 1.14387e-05 Average time for zero size MPI_Send(): 3.53196e-06 #PETSc Option Table entries: -ksp_converged_reason -ksp_monitor -ksp_monitor_true_residual -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: --with-python --prefix=/home/lida -with-mpi-dir=/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4 LDFLAGS="-L/home/lida/lib64 -L/home/lida/lib -L/home/lida/jdk/lib" CPPFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CXXFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" CFLAGS="-I/home/lida/include -I/home/lida/jdk/include -march=native -O3" --with-debugging=no --with-64-bit-indices FOPTFLAGS="-O3 -march=native" --download-make ----------------------------------------- Libraries compiled on 2022-05-25 10:03:14 on head1.hpc Machine characteristics: Linux-3.10.0-1062.el7.x86_64-x86_64-with-centos-7.7.1908-Core Using PETSc directory: /home/lida Using PETSc arch: ----------------------------------------- Using C compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 Using Fortran compiler: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 -march=native -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 ----------------------------------------- Using include paths: -I/home/lida/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include ----------------------------------------- Using C linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc Using Fortran linker: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpif90 Using libraries: -Wl,-rpath,/home/lida/lib -L/home/lida/lib -lpetsc -Wl,-rpath,/home/lida/lib64 -L/home/lida/lib64 -Wl,-rpath,/home/lida/lib -L/home/lida/lib -Wl,-rpath,/home/lida/jdk/lib -L/home/lida/jdk/lib -Wl,-rpath,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -L/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib/gcc/x86_64-pc-linux-gnu/8.3.0 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib64 -Wl,-rpath,/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -L/home/lida/intel/oneapi/mkl/2022.0.2/lib/intel64 -Wl,-rpath,/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -L/opt/software/intel/compilers_and_libraries_2020.2.254/linux/tbb/lib/intel64/gcc4.8 -Wl,-rpath,/opt/ohpc/pub/compiler/gcc/8.3.0/lib -L/opt/ohpc/pub/compiler/gcc/8.3.0/lib -lopenblas -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl ----------------------------------------- [lida at head1 build]$ -------------- next part -------------- [lida at head1 petsc]$ export OMP_NUM_THREADS=1 [lida at head1 petsc]$ make streams NPMAX=8 2>/dev/null /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicc -o MPIVersion.o -c -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 
-fPIC -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 -I/home/lida/Code/petsc/include -I/home/lida/Code/petsc/arch-linux-c-opt/include -I/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include -I/home/lida/include -I/home/lida/jdk/include -march=native -O3 `pwd`/MPIVersion.c Running streams with '/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpiexec --oversubscribe ' using 'NPMAX=8' 1 16106.3237 Rate (MB/s) 2 28660.2442 Rate (MB/s) 1.77944 3 42041.2053 Rate (MB/s) 2.61023 4 57109.2439 Rate (MB/s) 3.54577 5 66797.5164 Rate (MB/s) 4.14729 6 79516.0361 Rate (MB/s) 4.93695 7 88664.6509 Rate (MB/s) 5.50497 8 101902.1854 Rate (MB/s) 6.32685 ------------------------------------------------ Unable to open matplotlib to plot speedup Unable to open matplotlib to plot speedup -------------- next part -------------- A non-text attachment was scrubbed... Name: time per iterations.png Type: image/png Size: 15274 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: without # 0,2,12 iterations.png Type: image/png Size: 17069 bytes Desc: not available URL: From mfadams at lbl.gov Fri Jun 3 07:17:41 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 3 Jun 2022 08:17:41 -0400 Subject: [petsc-users] Sparse linear system solving In-Reply-To: <5475809d-16b1-e6a8-72b8-c4605f321e8e@mail.ioffe.ru> References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru> <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> <5475809d-16b1-e6a8-72b8-c4605f321e8e@mail.ioffe.ru> Message-ID: Your timing data in the first plot seems to have random integers (2,1,1) added to random iterations (0,2,12). Perhaps there is a bug in your test setup? Mark On Fri, Jun 3, 2022 at 6:42 AM Lidia wrote: > Dear Matt, Barry, > > thank you for the information about openMP! > > Now all processes are loaded well. But we see a strange behaviour of > running times at different iterations, see description below. Could you > please explain us the reason and how we can improve it? > > We need to quickly solve a big (about 1e6 rows) square sparse > non-symmetric matrix many times (about 1e5 times) consequently. Matrix is > constant at every iteration, and the right-side vector B is slowly changed > (we think that its change at every iteration should be less then 0.001 %). > So we use every previous solution vector X as an initial guess for the next > iteration. AMG preconditioner and GMRES solver are used. > > We have tested the code using a matrix with 631 000 rows, during 15 > consequent iterations, using vector X from the previous iterations. > Right-side vector B and matrix A are constant during the whole running. The > time of the first iteration is large (about 2 seconds) and is quickly > decreased to the next iterations (average time of last iterations were > about 0.00008 s). But some iterations in the middle (# 2 and # 12) have > huge time - 0.999063 second (see the figure with time dynamics attached). > This time of 0.999 second does not depend on the size of a matrix, on the > number of MPI processes, these time jumps also exist if we vary vector B. > Why these time jumps appear and how we can avoid them? > > The ksp_monitor out for this running (included 15 iterations) using 36 MPI > processes and a file with the memory bandwidth information (testSpeed) are > also attached. We can provide our C++ script if it is needed. > > Thanks a lot! 
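>
> The core of the loop we time is roughly the following sketch (the real
> C++ script is longer; A, B and X come from our own assembly code, the
> options above are set on the command line, and error checking is
> omitted):
>
> #include <petscksp.h>
>
> Mat A;            /* constant 631 000 x 631 000 sparse matrix */
> Vec B, X;         /* right-hand side and solution             */
> KSP ksp;
> PC  pc;
> PetscLogDouble t0, t1;
>
> /* ... assemble A, B and an initial X here ... */
>
> KSPCreate(PETSC_COMM_WORLD, &ksp);
> KSPSetOperators(ksp, A, A);                 /* A never changes          */
> KSPSetType(ksp, KSPGMRES);
> KSPGetPC(ksp, &pc);
> PCSetType(pc, PCGAMG);                      /* AMG preconditioner       */
> KSPSetInitialGuessNonzero(ksp, PETSC_TRUE); /* reuse previous X         */
> KSPSetFromOptions(ksp);
>
> for (int iter = 0; iter < 15; ++iter) {
>   /* in the real code B is updated here (changes < 0.001 %);
>      in this test B stays constant                            */
>   PetscTime(&t0);
>   KSPSolve(ksp, B, X);                      /* X keeps the old solution */
>   PetscTime(&t1);
>   PetscPrintf(PETSC_COMM_WORLD, "SOLV gmres iter %d time %f s\n",
>               iter, (double)(t1 - t0));
> }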
> Best, > Lidiia > > > > On 01.06.2022 21:14, Matthew Knepley wrote: > > On Wed, Jun 1, 2022 at 1:43 PM Lidia wrote: > >> Dear Matt, >> >> Thank you for the rule of 10,000 variables per process! We have run ex.5 >> with matrix 1e4 x 1e4 at our cluster and got a good performance dynamics >> (see the figure "performance.png" - dependency of the solving time in >> seconds on the number of cores). We have used GAMG preconditioner >> (multithread: we have added the option " >> -pc_gamg_use_parallel_coarse_grid_solver") and GMRES solver. And we have >> set one openMP thread to every MPI process. Now the ex.5 is working good on >> many mpi processes! But the running uses about 100 GB of RAM. >> >> How we can run ex.5 using many openMP threads without mpi? If we just >> change the running command, the cores are not loaded normally: usually just >> one core is loaded in 100 % and others are idle. Sometimes all cores are >> working in 100 % during 1 second but then again become idle about 30 >> seconds. Can the preconditioner use many threads and how to activate this >> option? >> > > Maye you could describe what you are trying to accomplish? Threads and > processes are not really different, except for memory sharing. However, > sharing large complex data structures rarely works. That is why they get > partitioned and operate effectively as distributed memory. You would not > really save memory by using > threads in this instance, if that is your goal. This is detailed in the > talks in this session (see 2016 PP Minisymposium on this page > https://cse.buffalo.edu/~knepley/relacs.html). > > Thanks, > > Matt > > >> The solving times (the time of the solver work) using 60 openMP threads >> is 511 seconds now, and while using 60 MPI processes - 13.19 seconds. >> >> ksp_monitor outs for both cases (many openMP threads or many MPI >> processes) are attached. >> >> >> Thank you! >> Best, >> Lidia >> >> On 31.05.2022 15:21, Matthew Knepley wrote: >> >> I have looked at the local logs. First, you have run problems of size 12 >> and 24. As a rule of thumb, you need 10,000 >> variables per process in order to see good speedup. >> >> Thanks, >> >> Matt >> >> On Tue, May 31, 2022 at 8:19 AM Matthew Knepley >> wrote: >> >>> On Tue, May 31, 2022 at 7:39 AM Lidia wrote: >>> >>>> Matt, Mark, thank you much for your answers! >>>> >>>> >>>> Now we have run example # 5 on our computer cluster and on the local >>>> server and also have not seen any performance increase, but by unclear >>>> reason running times on the local server are much better than on the >>>> cluster. >>>> >>> I suspect that you are trying to get speedup without increasing the >>> memory bandwidth: >>> >>> >>> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup >>> >>> Thanks, >>> >>> Matt >>> >>>> Now we will try to run petsc #5 example inside a docker container on >>>> our server and see if the problem is in our environment. I'll write you the >>>> results of this test as soon as we get it. >>>> >>>> The ksp_monitor outs for the 5th test at the current local server >>>> configuration (for 2 and 4 mpi processes) and for the cluster (for 1 and 3 >>>> mpi processes) are attached . >>>> >>>> >>>> And one more question. Potentially we can use 10 nodes and 96 threads >>>> at each node on our cluster. What do you think, which combination of >>>> numbers of mpi processes and openmp threads may be the best for the 5th >>>> example? >>>> >>>> Thank you! 
>>>> >>>> >>>> Best, >>>> Lidiia >>>> >>>> On 31.05.2022 05:42, Mark Adams wrote: >>>> >>>> And if you see "NO" change in performance I suspect the solver/matrix >>>> is all on one processor. >>>> (PETSc does not use threads by default so threads should not change >>>> anything). >>>> >>>> As Matt said, it is best to start with a PETSc example that does >>>> something like what you want (parallel linear solve, see >>>> src/ksp/ksp/tutorials for examples), and then add your code to it. >>>> That way you get the basic infrastructure in place for you, which is >>>> pretty obscure to the uninitiated. >>>> >>>> Mark >>>> >>>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley >>>> wrote: >>>> >>>>> On Mon, May 30, 2022 at 10:12 PM Lidia >>>>> wrote: >>>>> >>>>>> Dear colleagues, >>>>>> >>>>>> Is here anyone who have solved big sparse linear matrices using PETSC? >>>>>> >>>>> >>>>> There are lots of publications with this kind of data. Here is one >>>>> recent one: https://arxiv.org/abs/2204.01722 >>>>> >>>>> >>>>>> We have found NO performance improvement while using more and more >>>>>> mpi >>>>>> processes (1-2-3) and open-mp threads (from 1 to 72 threads). Did >>>>>> anyone >>>>>> faced to this problem? Does anyone know any possible reasons of such >>>>>> behaviour? >>>>>> >>>>> >>>>> Solver behavior is dependent on the input matrix. The only >>>>> general-purpose solvers >>>>> are direct, but they do not scale linearly and have high memory >>>>> requirements. >>>>> >>>>> Thus, in order to make progress you will have to be specific about >>>>> your matrices. >>>>> >>>>> >>>>>> We use AMG preconditioner and GMRES solver from KSP package, as our >>>>>> matrix is large (from 100 000 to 1e+6 rows and columns), sparse, >>>>>> non-symmetric and includes both positive and negative values. But >>>>>> performance problems also exist while using CG solvers with symmetric >>>>>> matrices. >>>>>> >>>>> >>>>> There are many PETSc examples, such as example 5 for the Laplacian, >>>>> that exhibit >>>>> good scaling with both AMG and GMG. >>>>> >>>>> >>>>>> Could anyone help us to set appropriate options of the preconditioner >>>>>> and solver? Now we use default parameters, maybe they are not the >>>>>> best, >>>>>> but we do not know a good combination. Or maybe you could suggest any >>>>>> other pairs of preconditioner+solver for such tasks? >>>>>> >>>>>> I can provide more information: the matrices that we solve, c++ >>>>>> script >>>>>> to run solving using petsc and any statistics obtained by our runs. >>>>>> >>>>> >>>>> First, please provide a description of the linear system, and the >>>>> output of >>>>> >>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view >>>>> >>>>> for each test case. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thank you in advance! >>>>>> >>>>>> Best regards, >>>>>> Lidiia Varshavchik, >>>>>> Ioffe Institute, St. Petersburg, Russia >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 3 07:19:03 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Jun 2022 08:19:03 -0400 Subject: [petsc-users] Sparse linear system solving In-Reply-To: <5475809d-16b1-e6a8-72b8-c4605f321e8e@mail.ioffe.ru> References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru> <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> <5475809d-16b1-e6a8-72b8-c4605f321e8e@mail.ioffe.ru> Message-ID: On Fri, Jun 3, 2022 at 6:42 AM Lidia wrote: > Dear Matt, Barry, > > thank you for the information about openMP! > > Now all processes are loaded well. But we see a strange behaviour of > running times at different iterations, see description below. Could you > please explain us the reason and how we can improve it? > > We need to quickly solve a big (about 1e6 rows) square sparse > non-symmetric matrix many times (about 1e5 times) consequently. Matrix is > constant at every iteration, and the right-side vector B is slowly changed > (we think that its change at every iteration should be less then 0.001 %). > So we use every previous solution vector X as an initial guess for the next > iteration. AMG preconditioner and GMRES solver are used. > > We have tested the code using a matrix with 631 000 rows, during 15 > consequent iterations, using vector X from the previous iterations. > Right-side vector B and matrix A are constant during the whole running. The > time of the first iteration is large (about 2 seconds) and is quickly > decreased to the next iterations (average time of last iterations were > about 0.00008 s). But some iterations in the middle (# 2 and # 12) have > huge time - 0.999063 second (see the figure with time dynamics attached). > This time of 0.999 second does not depend on the size of a matrix, on the > number of MPI processes, these time jumps also exist if we vary vector B. > Why these time jumps appear and how we can avoid them? > > PETSc is not taking this time. It must come from somewhere else in your code. Notice that no iterations are taken for any subsequent solves, so no operations other than the residual norm check (and preconditioner application) are being performed. Thanks, Matt > The ksp_monitor out for this running (included 15 iterations) using 36 MPI > processes and a file with the memory bandwidth information (testSpeed) are > also attached. We can provide our C++ script if it is needed. > > Thanks a lot! > Best, > Lidiia > > > > On 01.06.2022 21:14, Matthew Knepley wrote: > > On Wed, Jun 1, 2022 at 1:43 PM Lidia wrote: > >> Dear Matt, >> >> Thank you for the rule of 10,000 variables per process! We have run ex.5 >> with matrix 1e4 x 1e4 at our cluster and got a good performance dynamics >> (see the figure "performance.png" - dependency of the solving time in >> seconds on the number of cores). 
We have used GAMG preconditioner >> (multithread: we have added the option " >> -pc_gamg_use_parallel_coarse_grid_solver") and GMRES solver. And we have >> set one openMP thread to every MPI process. Now the ex.5 is working good on >> many mpi processes! But the running uses about 100 GB of RAM. >> >> How we can run ex.5 using many openMP threads without mpi? If we just >> change the running command, the cores are not loaded normally: usually just >> one core is loaded in 100 % and others are idle. Sometimes all cores are >> working in 100 % during 1 second but then again become idle about 30 >> seconds. Can the preconditioner use many threads and how to activate this >> option? >> > > Maye you could describe what you are trying to accomplish? Threads and > processes are not really different, except for memory sharing. However, > sharing large complex data structures rarely works. That is why they get > partitioned and operate effectively as distributed memory. You would not > really save memory by using > threads in this instance, if that is your goal. This is detailed in the > talks in this session (see 2016 PP Minisymposium on this page > https://cse.buffalo.edu/~knepley/relacs.html). > > Thanks, > > Matt > > >> The solving times (the time of the solver work) using 60 openMP threads >> is 511 seconds now, and while using 60 MPI processes - 13.19 seconds. >> >> ksp_monitor outs for both cases (many openMP threads or many MPI >> processes) are attached. >> >> >> Thank you! >> Best, >> Lidia >> >> On 31.05.2022 15:21, Matthew Knepley wrote: >> >> I have looked at the local logs. First, you have run problems of size 12 >> and 24. As a rule of thumb, you need 10,000 >> variables per process in order to see good speedup. >> >> Thanks, >> >> Matt >> >> On Tue, May 31, 2022 at 8:19 AM Matthew Knepley >> wrote: >> >>> On Tue, May 31, 2022 at 7:39 AM Lidia wrote: >>> >>>> Matt, Mark, thank you much for your answers! >>>> >>>> >>>> Now we have run example # 5 on our computer cluster and on the local >>>> server and also have not seen any performance increase, but by unclear >>>> reason running times on the local server are much better than on the >>>> cluster. >>>> >>> I suspect that you are trying to get speedup without increasing the >>> memory bandwidth: >>> >>> >>> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup >>> >>> Thanks, >>> >>> Matt >>> >>>> Now we will try to run petsc #5 example inside a docker container on >>>> our server and see if the problem is in our environment. I'll write you the >>>> results of this test as soon as we get it. >>>> >>>> The ksp_monitor outs for the 5th test at the current local server >>>> configuration (for 2 and 4 mpi processes) and for the cluster (for 1 and 3 >>>> mpi processes) are attached . >>>> >>>> >>>> And one more question. Potentially we can use 10 nodes and 96 threads >>>> at each node on our cluster. What do you think, which combination of >>>> numbers of mpi processes and openmp threads may be the best for the 5th >>>> example? >>>> >>>> Thank you! >>>> >>>> >>>> Best, >>>> Lidiia >>>> >>>> On 31.05.2022 05:42, Mark Adams wrote: >>>> >>>> And if you see "NO" change in performance I suspect the solver/matrix >>>> is all on one processor. >>>> (PETSc does not use threads by default so threads should not change >>>> anything). 
>>>> >>>> As Matt said, it is best to start with a PETSc example that does >>>> something like what you want (parallel linear solve, see >>>> src/ksp/ksp/tutorials for examples), and then add your code to it. >>>> That way you get the basic infrastructure in place for you, which is >>>> pretty obscure to the uninitiated. >>>> >>>> Mark >>>> >>>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley >>>> wrote: >>>> >>>>> On Mon, May 30, 2022 at 10:12 PM Lidia >>>>> wrote: >>>>> >>>>>> Dear colleagues, >>>>>> >>>>>> Is here anyone who have solved big sparse linear matrices using PETSC? >>>>>> >>>>> >>>>> There are lots of publications with this kind of data. Here is one >>>>> recent one: https://arxiv.org/abs/2204.01722 >>>>> >>>>> >>>>>> We have found NO performance improvement while using more and more >>>>>> mpi >>>>>> processes (1-2-3) and open-mp threads (from 1 to 72 threads). Did >>>>>> anyone >>>>>> faced to this problem? Does anyone know any possible reasons of such >>>>>> behaviour? >>>>>> >>>>> >>>>> Solver behavior is dependent on the input matrix. The only >>>>> general-purpose solvers >>>>> are direct, but they do not scale linearly and have high memory >>>>> requirements. >>>>> >>>>> Thus, in order to make progress you will have to be specific about >>>>> your matrices. >>>>> >>>>> >>>>>> We use AMG preconditioner and GMRES solver from KSP package, as our >>>>>> matrix is large (from 100 000 to 1e+6 rows and columns), sparse, >>>>>> non-symmetric and includes both positive and negative values. But >>>>>> performance problems also exist while using CG solvers with symmetric >>>>>> matrices. >>>>>> >>>>> >>>>> There are many PETSc examples, such as example 5 for the Laplacian, >>>>> that exhibit >>>>> good scaling with both AMG and GMG. >>>>> >>>>> >>>>>> Could anyone help us to set appropriate options of the preconditioner >>>>>> and solver? Now we use default parameters, maybe they are not the >>>>>> best, >>>>>> but we do not know a good combination. Or maybe you could suggest any >>>>>> other pairs of preconditioner+solver for such tasks? >>>>>> >>>>>> I can provide more information: the matrices that we solve, c++ >>>>>> script >>>>>> to run solving using petsc and any statistics obtained by our runs. >>>>>> >>>>> >>>>> First, please provide a description of the linear system, and the >>>>> output of >>>>> >>>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view >>>>> >>>>> for each test case. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thank you in advance! >>>>>> >>>>>> Best regards, >>>>>> Lidiia Varshavchik, >>>>>> Ioffe Institute, St. Petersburg, Russia >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Arne.Morten.Kvarving at sintef.no Fri Jun 3 08:09:13 2022 From: Arne.Morten.Kvarving at sintef.no (Arne Morten Kvarving) Date: Fri, 3 Jun 2022 13:09:13 +0000 Subject: [petsc-users] MatSchurComplementGetPmat voes Message-ID: Hi! I have a Chorin pressure correction solver with consistent pressure update, i.e. pressure solve is based on the Schur complement E = -A10*ainv(A00)*A01 with A10 = divergence, A00 the mass matrix and A01 the gradient. I have had this implemented with petsc for a long time and it's working fine. However, I've done the schur-complement manually, ie using a MatShell. I now wanted to see if I can implement this using the petsc facilities for the schur-complement, but I get a confusing error when I call MatSchurComplementGetPmat(). ----- Code snippet: MatCreateSchurComplement(m_blocks[0], m_blocks[0], m_blocks[1], m_blocks[2], nullptr, &E_operator); < ... setup the ksp for A00 > MatSchurComplementSetAinvType(E_operator, MAT_SCHUR_COMPLEMENT_AINV_DIAG); MatView(E_operator); MatSchurComplementGetPmat(E_operator, MAT_INITIAL_MATRIX, &E_pc); ----- This yields the output (I cut out the matrix elements): Mat Object: 1 MPI processes type: schurcomplement Schur complement A11 - A10 inv(A00) A01 A11 = 0 A10 Mat Object: 1 MPI processes type: seqaij KSP of A00 KSP Object: 1 MPI processes type: preonly maximum iterations=1000, initial guess is zero tolerances: relative=1e-06, absolute=1e-20, divergence=1e+06 left preconditioning using DEFAULT norm type for convergence test PC Object: 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1.02768 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=72, cols=72 package used to perform factorization: petsc total: nonzeros=4752, allocated nonzeros=4752 using I-node routines: found 22 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=72, cols=72 total: nonzeros=4624, allocated nonzeros=5184 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 24 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 1 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.17.2, Jun 02, 2022 [0]PETSC ERROR: ../d/bin/Stokes on a linux-gnu-cxx-opt named akvalung by akva Fri Jun 3 14:48:06 2022 [0]PETSC ERROR: Configure options --with-mpi=0 --with-lapack-lib=-llapack --with-64-bit-indices=0 --with-shared-libraries=0 --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-blas-lib=-lblas --CFLAGS=-fPIC --CXXFLAGS=-fPIC --FFLAGS=-fPIC [0]PETSC ERROR: #1 MatDestroy() at /home/akva/kode/petsc/petsc-3.17.2/src/mat/interface/matrix.c:1235 [0]PETSC ERROR: #2 MatCreateSchurComplementPmat() at /home/akva/kode/petsc/petsc-3.17.2/src/ksp/ksp/utils/schurm/schurm.c:763 [0]PETSC ERROR: #3 MatSchurComplementGetPmat_Basic() at /home/akva/kode/petsc/petsc-3.17.2/src/ksp/ksp/utils/schurm/schurm.c:785 [0]PETSC ERROR: #4 MatSchurComplementGetPmat() at /home/akva/kode/petsc/petsc-3.17.2/src/ksp/ksp/utils/schurm/schurm.c:835 where the errors come from the call call to obtain the preconditioner matrix. I don't see what I've done wrong, as far as I can see it's all following https://petsc.org/release/docs/manualpages/KSP/MatCreateSchurComplement.html#MatCreateSchurComplement MatCreateSchurComplement - Argonne National Laboratory Notes The Schur complement is NOT explicitly formed! Rather, this function returns a virtual Schur complement that can compute the matrix-vector product by using formula S = A11 - A10 A^{-1} A01 for Schur complement S and a KSP solver to approximate the action of A^{-1}.. All four matrices must have the same MPI communicator. petsc.org ? and https://petsc.org/release/docs/manualpages/KSP/MatSchurComplementGetPmat.html#MatSchurComplementGetPmat Looking into the code it seems to try to call MatDestroy() for the Sp matrix but as Sp has not been set up it fails (schurm.c:763) Removing that call as a test, it seems to succeed and I get the same solution as I do with my manual code. I'm sure I have done something stupid but I cannot see what, so any pointers would be appreciated. cheers arnem -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 3 08:43:58 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Jun 2022 09:43:58 -0400 Subject: [petsc-users] MatSchurComplementGetPmat voes In-Reply-To: References: Message-ID: On Fri, Jun 3, 2022 at 9:09 AM Arne Morten Kvarving via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi! > > I have a Chorin pressure correction solver with consistent pressure > update, i.e. > pressure solve is based on the Schur complement > > E = -A10*ainv(A00)*A01 > > with A10 = divergence, A00 the mass matrix and A01 the gradient. > > I have had this implemented with petsc for a long time and it's working > fine. However, I've done the schur-complement manually, ie using a MatShell. > > I now wanted to see if I can implement this using the petsc facilities for > the schur-complement, but I get a confusing error when I call > MatSchurComplementGetPmat(). > > ----- > > Code snippet: > > MatCreateSchurComplement(m_blocks[0], m_blocks[0], m_blocks[1], > m_blocks[2], nullptr, &E_operator); > < ... 
setup the ksp for A00 > > MatSchurComplementSetAinvType(E_operator, MAT_SCHUR_COMPLEMENT_AINV_DIAG); > MatView(E_operator); > MatSchurComplementGetPmat(E_operator, MAT_INITIAL_MATRIX, &E_pc); > > ----- > > This yields the output (I cut out the matrix elements): > Mat Object: 1 MPI processes > type: schurcomplement > Schur complement A11 - A10 inv(A00) A01 > A11 = 0 > A10 > Mat Object: 1 MPI processes > type: seqaij > KSP of A00 > KSP Object: 1 MPI processes > type: preonly > maximum iterations=1000, initial guess is zero > tolerances: relative=1e-06, absolute=1e-20, divergence=1e+06 > left preconditioning > using DEFAULT norm type for convergence test > PC Object: 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1.02768 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=72, cols=72 > package used to perform factorization: petsc > total: nonzeros=4752, allocated nonzeros=4752 > using I-node routines: found 22 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=72, cols=72 > total: nonzeros=4624, allocated nonzeros=5184 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 24 nodes, limit used is 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.17.2, Jun 02, 2022 > [0]PETSC ERROR: ../d/bin/Stokes on a linux-gnu-cxx-opt named akvalung by > akva Fri Jun 3 14:48:06 2022 > [0]PETSC ERROR: Configure options --with-mpi=0 --with-lapack-lib=-llapack > --with-64-bit-indices=0 --with-shared-libraries=0 --COPTFLAGS=-O3 > --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 --with-blas-lib=-lblas --CFLAGS=-fPIC > --CXXFLAGS=-fPIC --FFLAGS=-fPIC > [0]PETSC ERROR: #1 MatDestroy() at > /home/akva/kode/petsc/petsc-3.17.2/src/mat/interface/matrix.c:1235 > [0]PETSC ERROR: #2 MatCreateSchurComplementPmat() at > /home/akva/kode/petsc/petsc-3.17.2/src/ksp/ksp/utils/schurm/schurm.c:763 > [0]PETSC ERROR: #3 MatSchurComplementGetPmat_Basic() at > /home/akva/kode/petsc/petsc-3.17.2/src/ksp/ksp/utils/schurm/schurm.c:785 > [0]PETSC ERROR: #4 MatSchurComplementGetPmat() at > /home/akva/kode/petsc/petsc-3.17.2/src/ksp/ksp/utils/schurm/schurm.c:835 > > where the errors come from the call call to obtain the preconditioner > matrix. > I don't see what I've done wrong, as far as I can see it's all following > https://petsc.org/release/docs/manualpages/KSP/MatCreateSchurComplement.html#MatCreateSchurComplement > MatCreateSchurComplement - Argonne National Laboratory > > Notes The Schur complement is NOT explicitly formed! Rather, this function > returns a virtual Schur complement that can compute the matrix-vector > product by using formula S = A11 - A10 A^{-1} A01 for Schur complement S > and a KSP solver to approximate the action of A^{-1}.. All four matrices > must have the same MPI communicator. 
> petsc.org > *?* > > and > > https://petsc.org/release/docs/manualpages/KSP/MatSchurComplementGetPmat.html#MatSchurComplementGetPmat > > Looking into the code it seems to try to call MatDestroy() for the Sp > matrix but as Sp has not been set up it fails (schurm.c:763) > Removing that call as a test, it seems to succeed and I get the same > solution as I do > with my manual code. > > I'm sure I have done something stupid but I cannot see what, so any > pointers would be appreciated. > This is not your fault. If the flag is MAT_INITIAL_MATRIX, we are expecting the pointer to be initialized to NULL, but we never state this. I think if you do this, the code will start working. I will fix GetPmat() so that it does this automatically. Thanks, Matt > cheers > > arnem > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Jun 3 09:19:39 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 3 Jun 2022 09:19:39 -0500 (CDT) Subject: [petsc-users] petsc-3.17.2 now available Message-ID: <45e33633-8989-4793-4e82-8fad84be81@mcs.anl.gov> Dear PETSc users, The patch release petsc-3.17.2 is now available for download. http://www.mcs.anl.gov/petsc/download/index.html Satish From jsfaraway at gmail.com Fri Jun 3 11:50:50 2022 From: jsfaraway at gmail.com (jsfaraway) Date: Sat, 4 Jun 2022 00:50:50 +0800 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration Message-ID: <4C029833-EC0F-494F-911F-D795375718D9@gmail.com> An HTML attachment was scrubbed... URL: From jsfaraway at gmail.com Fri Jun 3 12:18:55 2022 From: jsfaraway at gmail.com (=?UTF-8?B?UnVuZmVuZyBKaW4=?=) Date: Sat, 4 Jun 2022 01:18:55 +0800 (GMT+08:00) Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration Message-ID: <629a4282.1c69fb81.f0697.c92d@mx.google.com> An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Jun 3 12:37:21 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 3 Jun 2022 19:37:21 +0200 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: <4C029833-EC0F-494F-911F-D795375718D9@gmail.com> References: <4C029833-EC0F-494F-911F-D795375718D9@gmail.com> Message-ID: Convergence depends on distribution of eigenvalues you want to compute. On the other hand, the cost also depends on the time it takes to build the preconditioner. Use -log_view to see the cost of the different steps of the computation. Jose > El 3 jun 2022, a las 18:50, jsfaraway escribi?: > > hello! > > I am trying to use epsgd compute matrix's one smallest eigenvalue. And I find a strang thing. There are two matrix A(900000*900000) and B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B use 22 iterations and 38885s! What could be the reason for this? Or what can I do to find the reason? > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". > And there is one difference I can tell is matrix B has many small value, whose absolute value is less than 10-6. Could this be the reason? > > Thank you! 
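>
> The solver part of my code is essentially just the API form of the
> options above, roughly this sketch (load_matrix() stands for my own
> reader and is only a placeholder; error checking omitted):
>
> #include <slepceps.h>
>
> Mat         A;
> EPS         eps;
> PetscInt    its, nconv;
> PetscScalar kr, ki;
>
> /* load_matrix(&A);  fill the sparse matrix here */
>
> EPSCreate(PETSC_COMM_WORLD, &eps);
> EPSSetOperators(eps, A, NULL);
> EPSSetType(eps, EPSGD);                        /* -eps_type gd            */
> EPSSetWhichEigenpairs(eps, EPS_SMALLEST_REAL); /* -eps_smallest_real      */
> EPSSetDimensions(eps, 3, 300, PETSC_DEFAULT);  /* -eps_nev 3 -eps_ncv 300 */
> EPSSetFromOptions(eps);
> EPSSolve(eps);
> EPSGetIterationNumber(eps, &its);
> EPSGetConverged(eps, &nconv);
> if (nconv > 0) EPSGetEigenpair(eps, 0, &kr, &ki, NULL, NULL);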
> > Runfeng Jin From mi.mike1021 at gmail.com Fri Jun 3 12:39:43 2022 From: mi.mike1021 at gmail.com (Mike Michell) Date: Fri, 3 Jun 2022 12:39:43 -0500 Subject: [petsc-users] PetscSF Object on Distributed DMPlex for Halo Data Exchange In-Reply-To: References: Message-ID: Thanks for the effort. By any chance, is there any rough timeline for that part can be done? Thanks, Mike > On Tue, May 31, 2022 at 10:26 AM Mike Michell > wrote: > >> Thank you. But, it seems that PetscSFCreateSectionSF() also requires >> petscsf.h file. Which header file I should include to call >> PetscSFCreateSectionSF() from Fortran? >> > > I will have to write a binding. I will send you the MR when I finish. > > THanks, > > Matt > > >> Thanks, >> Mike >> >> >> On Tue, May 31, 2022 at 10:04 AM Mike Michell >>> wrote: >>> >>>> As a follow-up question on your example, is it possible to call >>>> PetscSFCreateRemoteOffsets() from Fortran? >>>> >>>> My code is written in .F90 and in "petsc/finclude/" there is no >>>> petscsf.h so that the code currently cannot find >>>> PetscSFCreateRemoteOffsets(). >>>> >>> >>> I believe if you pass in NULL for remoteOffsets, that function will be >>> called internally. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Mike >>>> >>>> >>>> I will also point out that Toby has created a nice example showing how >>>>> to create an SF for halo exchange between local vectors. >>>>> >>>>> https://gitlab.com/petsc/petsc/-/merge_requests/5267 >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Sun, May 22, 2022 at 9:47 PM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sun, May 22, 2022 at 4:28 PM Mike Michell >>>>>> wrote: >>>>>> >>>>>>> Thanks for the reply. The diagram makes sense and is helpful for >>>>>>> understanding 1D representation. >>>>>>> >>>>>>> However, something is still unclear. From your diagram, the number >>>>>>> of roots per process seems to vary according to run arguments, such as >>>>>>> "-dm_distribute_overlap", because "the number of roots for a DMPlex is the >>>>>>> number of mesh points in the local portion of the mesh (cited from your >>>>>>> answer to my question (1))" will end up change according to that argument. >>>>>>> However, from my mock-up code, number of roots is independent to >>>>>>> -dm_distribute_overlap argument. The summation of "number of roots" through >>>>>>> processes was always equal to number of physical vertex on my mesh, if I >>>>>>> define the section layout on vertex with 1DOF. But in your diagram example, >>>>>>> the summation of "nroots" is larger than the actual number of mesh points, >>>>>>> which is 13. >>>>>>> >>>>>> >>>>>> I do not understand your question. Notice the -dm_distribute_overlap >>>>>> does _not_ change the owned points for any process. It only puts in new >>>>>> leaves, so it also never >>>>>> changes the roots for this way of using the SF. >>>>>> >>>>>> >>>>>>> Also, it is still unclear how to get the size of "roots" from the >>>>>>> PetscSection & PetscSF on distributed DMPlex? >>>>>>> >>>>>> >>>>>> For an SF mapping ghost dofs in a global vector, the number of roots >>>>>> is just the size of the local portion of the vector. >>>>>> >>>>>> >>>>>>> In your diagram, how can you tell your code and make it allocate the >>>>>>> "nroots=7 for P0, nroots=9 for P1, and nroots=7 for P2" arrays before you >>>>>>> call PetscSFBcastBegin/End()? It seems that we need to define arrays having >>>>>>> the size of nroots & nleaves before calling PetscSFBcastBegin/End(). 
>>>>>>> >>>>>> >>>>>> I just want to note that this usage is different from the canonical >>>>>> usage in Plex. It is fine to do this, but this will not match what I do in >>>>>> the library if you look. >>>>>> In Plex, I distinguish two linear spaces: >>>>>> >>>>>> 1) Global space: This is the vector space for the solvers. Each >>>>>> point is uniquely represented and owned by some process >>>>>> >>>>>> 2) Local space: This is the vector space for assembly. Some points >>>>>> are represented multiple times. >>>>>> >>>>>> I create an SF that maps from the global space (roots) to the local >>>>>> space (leaves), and it is called in DMGlobalToLocal() (and >>>>>> associated functions). This >>>>>> is more natural in FEM. You seem to want an SF that maps between >>>>>> global vectors. This will also work. The roots would be the local dofs, and >>>>>> the leaves >>>>>> would be shared dofs. >>>>>> >>>>>> Does this make sense? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Mike >>>>>>> >>>>>>> Here's a diagram of a 1D mesh with overlap and 3 partitions, showing >>>>>>>> what the petscsf data is for each. The number of roots is the number of >>>>>>>> mesh points in the local representation, and the number of leaves is the >>>>>>>> number of mesh points that are duplicates of mesh points on other >>>>>>>> processes. With that in mind, answering your questions >>>>>>>> >>>>>>>> > (1) It seems that the "roots" means the number of vertex not >>>>>>>> considering overlap layer, and "leaves" seems the number of distributed >>>>>>>> vertex for each processor that includes overlap layer. Can you acknowledge >>>>>>>> that this is correct understanding? I have tried to find clearer examples >>>>>>>> from PETSc team's articles relevant to Star Forest, but I am still unclear >>>>>>>> about the exact relation & graphical notation of roots & leaves in SF if >>>>>>>> it's the case of DMPlex solution arrays. >>>>>>>> >>>>>>>> No, the number of roots for a DMPlex is the number of mesh points >>>>>>>> in the local portion of the mesh >>>>>>>> >>>>>>>> > (2) If it is so, there is an issue that I cannot define "root >>>>>>>> data" and "leave data" generally. I am trying to following >>>>>>>> "src/vec/is/sf/tutorials/ex1f.F90", however, in that example, size of roots >>>>>>>> and leaves are predefined as 6. How can I generalize that? Because I can >>>>>>>> get size of leaves using DAG depth(or height), which is equal to number of >>>>>>>> vertices each proc has. But, how can I get the size of my "roots" region >>>>>>>> from SF? Any example about that? This question is connected to how can I >>>>>>>> define "rootdata" for "PetscSFBcastBegin/End()". >>>>>>>> >>>>>>>> Does the diagram help you generalize? >>>>>>>> >>>>>>>> > (3) More importantly, with the attached PetscSection & SF layout, >>>>>>>> my vector is only resolved for the size equal to "number of roots" for each >>>>>>>> proc, but not for the overlapping area(i.e., "leaves"). What I wish to do >>>>>>>> is to exchange (or reduce) the solution data between each proc, in the >>>>>>>> overlapping region. Can I get some advices why my vector does not encompass >>>>>>>> the "leaves" regime? Is there any example doing similar things? >>>>>>>> Going back to my first response: if you use a section to say how >>>>>>>> many pieces of data are associated with each local mesh point, then a >>>>>>>> PetscSF is constructed that requires no more manipulation from you. 
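To make the global/local distinction above concrete, here is a minimal sketch of the canonical update through DMGlobalToLocal()/DMLocalToGlobal() that Matt mentions. It assumes a local section has already been set on the DM; the vector names are illustrative.

  #include <petscdm.h>

  /* Sketch: canonical Plex halo update between the solver (global) and
     assembly (local) spaces. */
  static PetscErrorCode UpdateHalo(DM dm)
  {
    Vec gx, lx;

    PetscFunctionBeginUser;
    PetscCall(DMCreateGlobalVector(dm, &gx)); /* each dof owned by exactly one rank */
    PetscCall(DMCreateLocalVector(dm, &lx));  /* owned dofs plus ghost copies */
    /* ... compute or solve into gx ... */
    PetscCall(DMGlobalToLocalBegin(dm, gx, INSERT_VALUES, lx)); /* fill ghosts: the halo exchange */
    PetscCall(DMGlobalToLocalEnd(dm, gx, INSERT_VALUES, lx));
    /* ... assemble local contributions into lx ... */
    PetscCall(DMLocalToGlobalBegin(dm, lx, ADD_VALUES, gx));    /* accumulate shared contributions */
    PetscCall(DMLocalToGlobalEnd(dm, lx, ADD_VALUES, gx));
    PetscCall(VecDestroy(&lx));
    PetscCall(VecDestroy(&gx));
    PetscFunctionReturn(0);
  }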
>>>>>>>> >>>>>>>> >>>>>>>> On Sun, May 22, 2022 at 10:47 AM Mike Michell < >>>>>>>> mi.mike1021 at gmail.com> wrote: >>>>>>>> >>>>>>>>> Thank you for the reply. >>>>>>>>> The PetscSection and PetscSF objects are defined as in the >>>>>>>>> attached mock-up code (Q_PetscSF_1.tar). 1-DOF is defined on vertex as my >>>>>>>>> solution is determined on each vertex with 1-DOF from a finite-volume >>>>>>>>> method. >>>>>>>>> >>>>>>>>> As follow up questions: >>>>>>>>> (1) It seems that the "roots" means the number of vertex not >>>>>>>>> considering overlap layer, and "leaves" seems the number of distributed >>>>>>>>> vertex for each processor that includes overlap layer. Can you acknowledge >>>>>>>>> that this is correct understanding? I have tried to find clearer examples >>>>>>>>> from PETSc team's articles relevant to Star Forest, but I am still unclear >>>>>>>>> about the exact relation & graphical notation of roots & leaves in SF if >>>>>>>>> it's the case of DMPlex solution arrays. >>>>>>>>> >>>>>>>>> (2) If it is so, there is an issue that I cannot define "root >>>>>>>>> data" and "leave data" generally. I am trying to following >>>>>>>>> "src/vec/is/sf/tutorials/ex1f.F90", however, in that example, size of roots >>>>>>>>> and leaves are predefined as 6. How can I generalize that? Because I can >>>>>>>>> get size of leaves using DAG depth(or height), which is equal to number of >>>>>>>>> vertices each proc has. But, how can I get the size of my "roots" region >>>>>>>>> from SF? Any example about that? This question is connected to how can I >>>>>>>>> define "rootdata" for "PetscSFBcastBegin/End()". >>>>>>>>> >>>>>>>>> (3) More importantly, with the attached PetscSection & SF layout, >>>>>>>>> my vector is only resolved for the size equal to "number of roots" for each >>>>>>>>> proc, but not for the overlapping area(i.e., "leaves"). What I wish to do >>>>>>>>> is to exchange (or reduce) the solution data between each proc, in the >>>>>>>>> overlapping region. Can I get some advices why my vector does not encompass >>>>>>>>> the "leaves" regime? Is there any example doing similar things? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Mike >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Fri, May 20, 2022 at 4:45 PM Mike Michell < >>>>>>>>>> mi.mike1021 at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Thanks for the reply. >>>>>>>>>>> >>>>>>>>>>> > "What I want to do is to exchange data (probably just >>>>>>>>>>> MPI_Reduce)" which confuses me, because halo exchange is a point-to-point >>>>>>>>>>> exchange and not a reduction. Can you clarify? >>>>>>>>>>> PetscSFReduceBegin/End seems to be the function that do >>>>>>>>>>> reduction for PetscSF object. What I intended to mention was either >>>>>>>>>>> reduction or exchange, not specifically intended "reduction". >>>>>>>>>>> >>>>>>>>>>> As a follow-up question: >>>>>>>>>>> Assuming that the code has its own local solution arrays (not >>>>>>>>>>> Petsc type), and if the plex's DAG indices belong to the halo region are >>>>>>>>>>> the only information that I want to know (not the detailed section >>>>>>>>>>> description, such as degree of freedom on vertex, cells, etc.). I have >>>>>>>>>>> another PetscSection for printing out my solution. >>>>>>>>>>> Also if I can convert that DAG indices into my local cell/vertex >>>>>>>>>>> index, can I just use the PetscSF object created from DMGetPointSF(), >>>>>>>>>>> instead of "creating PetscSection + DMGetSectionSF()"? 
In other words, can >>>>>>>>>>> I use the PetscSF object declared from DMGetPointSF() for the halo >>>>>>>>>>> communication? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> No, because that point SF will index information by point number. >>>>>>>>>> You would need to build a new SF that indexes your dofs. The steps you would >>>>>>>>>> go through are exactly the same as you would if you just told us >>>>>>>>>> what the Section is that indexes your data. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Mike >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The PetscSF that is created automatically is the "point sf" ( >>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMGetPointSF/): it >>>>>>>>>>>> says which mesh points (cells, faces, edges and vertices) are duplicates of >>>>>>>>>>>> others. >>>>>>>>>>>> >>>>>>>>>>>> In a finite volume application we typically want to assign >>>>>>>>>>>> degrees of freedom just to cells: some applications may only have one >>>>>>>>>>>> degree of freedom, others may have multiple. >>>>>>>>>>>> >>>>>>>>>>>> You encode where you want degrees of freedom in a PetscSection >>>>>>>>>>>> and set that as the section for the DM in DMSetLocalSection() ( >>>>>>>>>>>> https://petsc.org/release/docs/manualpages/DM/DMSetLocalSection.html >>>>>>>>>>>> ) >>>>>>>>>>>> >>>>>>>>>>>> (A c example of these steps that sets degrees of freedom for >>>>>>>>>>>> *vertices* instead of cells is `src/dm/impls/plex/tutorials/ex7.c`) >>>>>>>>>>>> >>>>>>>>>>>> After that you can call DMGetSectionSF() ( >>>>>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMGetSectionSF/) to >>>>>>>>>>>> the the PetscSF that you want for halo exchange: the one for your solution >>>>>>>>>>>> variables. >>>>>>>>>>>> >>>>>>>>>>>> After that, the only calls you typically need in a finite >>>>>>>>>>>> volume code is PetscSFBcastBegin() to start a halo exchange and >>>>>>>>>>>> PetscSFBcastEnd() to complete it. >>>>>>>>>>>> >>>>>>>>>>>> You say >>>>>>>>>>>> >>>>>>>>>>>> > What I want to do is to exchange data (probably just >>>>>>>>>>>> MPI_Reduce) >>>>>>>>>>>> >>>>>>>>>>>> which confuses me, because halo exchange is a point-to-point >>>>>>>>>>>> exchange and not a reduction. Can you clarify? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, May 20, 2022 at 8:35 PM Mike Michell < >>>>>>>>>>>> mi.mike1021 at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Dear PETSc developer team, >>>>>>>>>>>>> >>>>>>>>>>>>> Hi, I am using DMPlex for a finite-volume code and trying to >>>>>>>>>>>>> understand the usage of PetscSF. What is a typical procedure for doing halo >>>>>>>>>>>>> data exchange at parallel boundary using PetscSF object on DMPlex? Is there >>>>>>>>>>>>> any example that I can refer to usage of PetscSF with distributed DMPlex? >>>>>>>>>>>>> >>>>>>>>>>>>> Assuming to use the attached mock-up code and mesh, if I give >>>>>>>>>>>>> "-dm_distribute_overlap 1 -over_dm_view" to run the code, I can see a >>>>>>>>>>>>> PetscSF object is already created, although I have not called >>>>>>>>>>>>> "PetscSFCreate" in the code. How can I import & use that PetscSF already >>>>>>>>>>>>> created by the code to do the halo data exchange? >>>>>>>>>>>>> >>>>>>>>>>>>> What I want to do is to exchange data (probably just >>>>>>>>>>>>> MPI_Reduce) in a parallel boundary region using PetscSF and its functions. >>>>>>>>>>>>> I might need to have an overlapping layer or not. 
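Putting the steps above together, a compact sketch for the finite-volume case (one scalar dof per cell, exchanged through the section SF). This is a sketch only: the per-cell layout is an assumption about the application, the buffers are assumed to have the sizes of the global and local vectors respectively, and the MPI_Op argument of PetscSFBcastBegin() requires a reasonably recent PETSc.

  #include <petscdmplex.h>
  #include <petscsf.h>

  /* Sketch: 1 dof per cell; globaldata is sized like the global vector (owned dofs),
     localdata like the local vector (owned + ghost dofs). Only the ghost (leaf)
     entries of localdata are written by the broadcast. */
  static PetscErrorCode CellHaloExchange(DM dm, const PetscScalar *globaldata, PetscScalar *localdata)
  {
    PetscSection s;
    PetscSF      sf;
    PetscInt     c, cStart, cEnd;

    PetscFunctionBeginUser;
    PetscCall(PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s));
    PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd)); /* height 0 = cells */
    PetscCall(PetscSectionSetChart(s, cStart, cEnd));
    for (c = cStart; c < cEnd; ++c) PetscCall(PetscSectionSetDof(s, c, 1));
    PetscCall(PetscSectionSetUp(s));
    PetscCall(DMSetLocalSection(dm, s));
    PetscCall(PetscSectionDestroy(&s));

    PetscCall(DMGetSectionSF(dm, &sf)); /* maps global dofs (roots) to local dofs (leaves) */
    PetscCall(PetscSFBcastBegin(sf, MPIU_SCALAR, globaldata, localdata, MPI_REPLACE));
    PetscCall(PetscSFBcastEnd(sf, MPIU_SCALAR, globaldata, localdata, MPI_REPLACE));
    PetscFunctionReturn(0);
  }

In practice DMGlobalToLocalBegin/End() on vectors obtained from DMCreateGlobalVector()/DMCreateLocalVector() performs the same exchange without touching raw arrays.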
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Mike >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lidia.varsh at mail.ioffe.ru Mon Jun 6 06:19:37 2022 From: lidia.varsh at mail.ioffe.ru (Lidia) Date: Mon, 6 Jun 2022 14:19:37 +0300 Subject: [petsc-users] Sparse linear system solving In-Reply-To: References: <026d55af-c978-81da-f571-46519e5e6f8e@mail.ioffe.ru> <2e7ebbf1-511a-7055-ff92-131d3bf73f1e@mail.ioffe.ru> <201b6c28-2616-edaa-21dd-ee91a257c432@mail.ioffe.ru> <5475809d-16b1-e6a8-72b8-c4605f321e8e@mail.ioffe.ru> Message-ID: <7db517a0-e541-fb1b-b4b7-9063499bc939@mail.ioffe.ru> Dear colleagues, Thank you much for the help! Now the code seems to be working well! Best, Lidiia On 03.06.2022 15:19, Matthew Knepley wrote: > On Fri, Jun 3, 2022 at 6:42 AM Lidia wrote: > > Dear Matt, Barry, > > thank you for the information about openMP! > > Now all processes are loaded well. But we see a strange behaviour > of running times at different iterations, see description below. > Could you please explain us the reason and how we can improve it? > > We need to quickly solve a big (about 1e6 rows) square sparse > non-symmetric matrix many times (about 1e5 times) consequently. > Matrix is constant at every iteration, and the right-side vector B > is slowly changed (we think that its change at every iteration > should be less then 0.001 %). So we use every previous solution > vector X as an initial guess for the next iteration. AMG > preconditioner and GMRES solver are used. > > We have tested the code using a matrix with 631 000 rows, during > 15 consequent iterations, using vector X from the previous > iterations. Right-side vector B and matrix A are constant during > the whole running. The time of the first iteration is large (about > 2 seconds) and is quickly decreased to the next iterations > (average time of last iterations were about 0.00008 s). But some > iterations in the middle (# 2 and # 12) have huge time - 0.999063 > second (see the figure with time dynamics attached). 
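(An aside on the repeated-solve setup described above: with a constant matrix, reusing one KSP and warm-starting from the previous solution looks roughly like the sketch below. The solver and preconditioner names match the thread; everything else is illustrative and error handling is trimmed.)

  #include <petscksp.h>

  /* Sketch: one KSP reused for many right-hand sides of the same matrix. */
  static PetscErrorCode RepeatedSolve(Mat A, Vec b, Vec x, PetscInt nsteps)
  {
    KSP      ksp;
    PC       pc;
    PetscInt step;

    PetscFunctionBeginUser;
    PetscCall(KSPCreate(PetscObjectComm((PetscObject)A), &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));                  /* matrix is constant over all solves */
    PetscCall(KSPSetType(ksp, KSPGMRES));
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCGAMG));                       /* AMG setup happens once, on the first solve */
    PetscCall(KSPSetInitialGuessNonzero(ksp, PETSC_TRUE));  /* keep the previous x as initial guess */
    PetscCall(KSPSetFromOptions(ksp));
    for (step = 0; step < nsteps; ++step) {
      /* ... update b slightly ... */
      PetscCall(KSPSolve(ksp, b, x));                       /* preconditioner is reused, not rebuilt */
    }
    PetscCall(KSPDestroy(&ksp));
    PetscFunctionReturn(0);
  }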
This time of > 0.999 second does not depend on the size of a matrix, on the > number of MPI processes, these time jumps also exist if we vary > vector B. Why these time jumps appear and how we can avoid them? > > > PETSc is not taking this time. It must come from somewhere else in > your code. Notice that no iterations are taken for any subsequent > solves, so no operations other than the residual norm check (and > preconditioner application) are being performed. > > ? Thanks, > > ? ? ?Matt > > The ksp_monitor out for this running (included 15 iterations) > using 36 MPI processes and a file with the memory bandwidth > information (testSpeed) are also attached. We can provide our C++ > script if it is needed. > > Thanks a lot! > > Best, > Lidiia > > > > On 01.06.2022 21:14, Matthew Knepley wrote: >> On Wed, Jun 1, 2022 at 1:43 PM Lidia >> wrote: >> >> Dear Matt, >> >> Thank you for the rule of 10,000 variables per process! We >> have run ex.5 with matrix 1e4 x 1e4 at our cluster and got a >> good performance dynamics (see the figure "performance.png" - >> dependency of the solving time in seconds on the number of >> cores). We have used GAMG preconditioner (multithread: we >> have added the option >> "-pc_gamg_use_parallel_coarse_grid_solver") and GMRES solver. >> And we have set one openMP thread to every MPI process. Now >> the ex.5 is working good on many mpi processes! But the >> running uses about 100 GB of RAM. >> >> How we can run ex.5 using many openMP threads without mpi? If >> we just change the running command, the cores are not loaded >> normally: usually just one core is loaded in 100 % and others >> are idle. Sometimes all cores are working in 100 % during 1 >> second but then again become idle about 30 seconds. Can the >> preconditioner use many threads and how to activate this option? >> >> >> Maye you could describe what you are trying to accomplish? >> Threads and processes are not really different, except for memory >> sharing. However, sharing large complex data structures rarely >> works. That is why they get partitioned and operate effectively >> as distributed memory. You would not really save memory by using >> threads in this instance, if that is your goal. This is detailed >> in the talks in this session (see 2016 PP Minisymposium on this >> page https://cse.buffalo.edu/~knepley/relacs.html). >> >> ? Thanks, >> >> ? ? ?Matt >> >> The solving times (the time of the solver work) using 60 >> openMP threads is 511 seconds now, and while using 60 MPI >> processes - 13.19 seconds. >> >> ksp_monitor outs for both cases (many openMP threads or many >> MPI processes) are attached. >> >> >> Thank you! >> >> Best, >> Lidia >> >> On 31.05.2022 15:21, Matthew Knepley wrote: >>> I have looked at the local logs. First, you have run >>> problems of size 12? and 24. As a rule of thumb, you need >>> 10,000 >>> variables per process in order to see good speedup. >>> >>> ? Thanks, >>> >>> ? ? ?Matt >>> >>> On Tue, May 31, 2022 at 8:19 AM Matthew Knepley >>> wrote: >>> >>> On Tue, May 31, 2022 at 7:39 AM Lidia >>> wrote: >>> >>> Matt, Mark, thank you much for your answers! >>> >>> >>> Now we have run example # 5 on our computer cluster >>> and on the local server and also have not seen any >>> performance increase, but by unclear reason running >>> times on the local server are much better than on >>> the cluster. 
>>> >>> I suspect that you are trying to get speedup without >>> increasing the memory bandwidth: >>> >>> https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup >>> >>> ? Thanks, >>> >>> ? ? ?Matt >>> >>> Now we will try to run petsc #5 example inside a >>> docker container on our server and see if the >>> problem is in our environment. I'll write you the >>> results of this test as soon as we get it. >>> >>> The ksp_monitor outs for the 5th test at the current >>> local server configuration (for 2 and 4 mpi >>> processes) and for the cluster (for 1 and 3 mpi >>> processes) are attached . >>> >>> >>> And one more question. Potentially we can use 10 >>> nodes and 96 threads at each node on our cluster. >>> What do you think, which combination of numbers of >>> mpi processes and openmp threads may be the best for >>> the 5th example? >>> >>> Thank you! >>> >>> >>> Best, >>> Lidiia >>> >>> On 31.05.2022 05:42, Mark Adams wrote: >>>> And if you see "NO" change in performance I suspect >>>> the solver/matrix is all on one processor. >>>> (PETSc does not use threads by default so threads >>>> should not change anything). >>>> >>>> As Matt said, it is best to start with a PETSc >>>> example?that does something like what you want >>>> (parallel linear solve, see src/ksp/ksp/tutorials >>>> for examples), and then add your code to it. >>>> That way you get the basic infrastructure?in place >>>> for you, which is pretty obscure to the uninitiated. >>>> >>>> Mark >>>> >>>> On Mon, May 30, 2022 at 10:18 PM Matthew Knepley >>>> wrote: >>>> >>>> On Mon, May 30, 2022 at 10:12 PM Lidia >>>> wrote: >>>> >>>> Dear colleagues, >>>> >>>> Is here anyone who have solved big sparse >>>> linear matrices using PETSC? >>>> >>>> >>>> There are lots of publications with this kind >>>> of data. Here is one recent one: >>>> https://arxiv.org/abs/2204.01722 >>>> >>>> We have found NO performance improvement >>>> while using more and more mpi >>>> processes (1-2-3) and open-mp threads (from >>>> 1 to 72 threads). Did anyone >>>> faced to this problem? Does anyone know any >>>> possible reasons of such >>>> behaviour? >>>> >>>> >>>> Solver behavior is dependent on the input >>>> matrix. The only general-purpose solvers >>>> are direct, but they do not scale linearly and >>>> have high memory requirements. >>>> >>>> Thus, in order to make progress you will have >>>> to be specific about your matrices. >>>> >>>> We use AMG preconditioner and GMRES solver >>>> from KSP package, as our >>>> matrix is large (from 100 000 to 1e+6 rows >>>> and columns), sparse, >>>> non-symmetric and includes both positive >>>> and negative values. But >>>> performance problems also exist while using >>>> CG solvers with symmetric >>>> matrices. >>>> >>>> >>>> There are many PETSc examples, such as example >>>> 5 for the Laplacian, that exhibit >>>> good scaling with both AMG and GMG. >>>> >>>> Could anyone help us to set appropriate >>>> options of the preconditioner >>>> and solver? Now we use default parameters, >>>> maybe they are not the best, >>>> but we do not know a good combination. Or >>>> maybe you could suggest any >>>> other pairs of preconditioner+solver for >>>> such tasks? >>>> >>>> I can provide more information: the >>>> matrices that we solve, c++ script >>>> to run solving using petsc and any >>>> statistics obtained by our runs. >>>> >>>> >>>> First, please provide a description of the >>>> linear system, and the output of >>>> >>>> ? 
-ksp_view -ksp_monitor_true_residual >>>> -ksp_converged_reason -log_view >>>> >>>> for each test case. >>>> >>>> ? Thanks, >>>> >>>> ? ? ?Matt >>>> >>>> Thank you in advance! >>>> >>>> Best regards, >>>> Lidiia Varshavchik, >>>> Ioffe Institute, St. Petersburg, Russia >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before >>>> they begin their experiments is infinitely more >>>> interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they >>> begin their experiments is infinitely more interesting >>> than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Jun 7 05:37:07 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 7 Jun 2022 12:37:07 +0200 Subject: [petsc-users] Accelerating eigenvalue computation / removing portion of spectrum In-Reply-To: References: <7E80B1DF-0F06-4EFB-99BA-63471F55165D@dsic.upv.es> <62743559-BA54-4828-B0D8-B84111C2E1EA@dsic.upv.es> Message-ID: Lucas, I have tried your matrices. Below are some results with complex scalars and MUMPS using 2 MPI processes. Using shift-and-invert with eps_target=-0.95 I get three eigenvalues (two of them equal to -1), and MUMPS is taking 65 seconds out of 67. Convergence is very fast, nothing can be improved because most of the time is due to MUMPS. Adding a region filter with -rg_type interval -rg_interval_endpoints -.99,1,-.1,.1 the times are essentially the same, but you get rid of the unwanted eigenvalues (-1). This is the best option. If you need to compute many eigenvalues, then you should consider specifying an interval (spectrum slicing), see section 3.4.5 of the manual. But this cannot be used with complex scalars with MUMPS (see the note in the manual). Since your matrices are real and symmetric, I tried it with real scalars, using -eps_interval -.99,1 and in that case I get 33 eigenvalues and MUMPS takes 33 seconds out of the overall 68 seconds (three numerical factorizations are done in this execution). $ mpiexec -n 2 ./ex7 -f1 Areal.mat -f2 Breal.mat -eps_gen_hermitian -st_type sinvert -st_pc_type cholesky -eps_interval -.99,1 -st_mat_mumps_icntl_13 1 -st_mat_mumps_icntl_24 1 Generalized eigenproblem stored in file. Reading REAL matrices from binary files... 
Number of iterations of the method: 3 Number of linear iterations of the method: 0 Solution method: krylovschur Number of requested eigenvalues: 33 Stopping condition: tol=1e-10, maxit=100 Linear eigensolve converged (33 eigenpairs) due to CONVERGED_TOL; iterations 3 ---------------------- -------------------- k ||Ax-kBx||/||kx|| ---------------------- -------------------- -0.698786 4.61016e-14 -0.598058 5.34239e-14 -0.598051 5.53609e-14 -0.380951 7.83403e-14 -0.280707 2.91772e-13 -0.280671 3.86414e-13 -0.273832 2.18507e-13 -0.273792 2.25672e-13 -0.064625 2.71132e-12 -0.064558 2.74757e-12 -0.034888 4.02325e-12 0.138192 1.56285e-12 0.138298 3.58149e-12 0.197123 1.77274e-12 0.197391 1.93185e-12 0.268338 1.09276e-12 0.268416 8.24014e-13 0.363498 9.21471e-13 0.420608 7.18076e-13 0.420669 5.13068e-13 0.523661 1.28491e-12 0.621233 1.07663e-12 0.621648 5.91783e-13 0.662408 4.36285e-13 0.662578 5.11942e-13 0.708328 3.94862e-13 0.708488 3.56613e-13 0.709269 2.73414e-13 0.733286 5.73269e-13 0.733524 4.52308e-13 0.814093 2.5299e-13 0.870087 2.02513e-13 0.870229 3.19166e-13 ---------------------- -------------------- > El 31 may 2022, a las 22:28, Jose E. Roman escribi?: > > Probably MUMPS is taking most of the time... > > If the matrices are not too large, send them to my personal email and I will have a look. > > Jose > > >> El 31 may 2022, a las 22:13, Lucas Banting escribi?: >> >> Thanks for the sharing the article. >> For my application, I think using an interval region to exclude the unneeded eigenvalues will still be faster than forming a larger constrained system. Specifying an interval appears to run in a similar amount of time. >> >> Lucas >> From: Jose E. Roman >> Sent: Tuesday, May 31, 2022 2:08 PM >> To: Lucas Banting >> Cc: PETSc >> Subject: Re: [petsc-users] Accelerating eigenvalue computation / removing portion of spectrum >> >> Caution: This message was sent from outside the University of Manitoba. >> >> >> Please respond to the list also. >> >> The problem with EPSSetDeflationSpace() is that it internally orthogonalizes the vectors that you pass in, so it is not viable for thousands of vectors. >> >> You can try implementing any of the alternative schemes described in https://doi.org/10.1002/nla.307 >> >> Another thing you can try is to use a region for filtering, as explained in section 2.6.4 of the users manual. Use a region that excludes -1.0 and you will have more chances to get the wanted eigenvalues faster. But still convergence may be slow. >> >> Jose >> >> >>> El 31 may 2022, a las 20:52, Lucas Banting escribi?: >>> >>> Thanks for the response Jose, >>> >>> There is an analytical solution for these modes actually, however there are thousands of them and they are all sparse. >>> I assume it is a non-trivial thing for EPSSetDeflationSpace() to take something like a MATAIJ as input? >>> >>> Lucas >>> From: Jose E. Roman >>> Sent: Tuesday, May 31, 2022 1:11 PM >>> To: Lucas Banting >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] Accelerating eigenvalue computation / removing portion of spectrum >>> >>> Caution: This message was sent from outside the University of Manitoba. 
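The region filter used on the command line above (-rg_type interval -rg_interval_endpoints -.99,1,-.1,.1) can also be set in code. A short sketch, with the endpoint values taken from the thread and eps assumed to be an already created EPS:

  RG rg;

  PetscCall(EPSGetRG(eps, &rg));
  PetscCall(RGSetType(rg, RGINTERVAL));
  /* restrict the computation to eigenvalues with real part in [-0.99, 1]
     and imaginary part in [-0.1, 0.1], filtering out the -1 eigenvalues */
  PetscCall(RGIntervalSetEndpoints(rg, -0.99, 1.0, -0.1, 0.1));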
>>> >>> >>> If you know how to cheaply compute a basis of the nullspace of S, then you can try passing it to the solver via EPSSetDeflationSpace()https://slepc.upv.es/documentation/current/docs/manualpages/EPS/EPSSetDeflationSpace.html >>> >>> Jose >>> >>> >>>> El 31 may 2022, a las 19:28, Lucas Banting escribi?: >>>> >>>> Hello, >>>> >>>> I have a general non hermitian eigenvalue problem arising from the 3D helmholtz equation. >>>> The form of the helmholtz equaton is: >>>> >>>> (S - k^2M)v = lambda k^2 M v >>>> >>>> Where S is the stiffness/curl-curl matrix and M is the mass matrix associated with edge elements used to discretize the problem. >>>> The helmholtz equation creates eigenvalues of -1.0, which I believe are eigenvectors that are part of the null space of the curl-curl operator S. >>>> >>>> For my application, I would like to compute eigenvalues > -1.0, and avoid computation of eigenvalues of -1.0. >>>> I am currently using shift invert ST with mumps LU direct solver. By increasing the shift away from lambda=-1.0. I get faster computation of eigenvectors, and the lambda=-1.0 eigenvectors appear to slow down the computation by about a factor of two. >>>> Is there a way to avoid these lambda = -1.0 eigenpairs with a GNHEP problem type? >>>> >>>> Regards, >>>> Lucas > From yu1299885905 at outlook.com Tue Jun 7 08:51:32 2022 From: yu1299885905 at outlook.com (wang yuqi) Date: Tue, 7 Jun 2022 13:51:32 +0000 Subject: [petsc-users] PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range Message-ID: Hi, Dear developer: I encountered the following problems when I run my code with PETSC-3.5.2: [46]PETSC ERROR: ------------------------------------------------------------------------ [46]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [46]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [46]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [46]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors Could you please help me to fix this problem? Thank you very much! Best Regards. Yuqi Wang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 7 09:00:37 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 7 Jun 2022 10:00:37 -0400 Subject: [petsc-users] PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range In-Reply-To: References: Message-ID: On Tue, Jun 7, 2022 at 9:51 AM wang yuqi wrote: > Hi, Dear developer: > > I encountered the following problems when I run my code with PETSC-3.5.2: > > > > [46]PETSC ERROR: > ------------------------------------------------------------------------ > > [46]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [46]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [46]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [46]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > > > Could you please help me to fix this problem? > > It may not be a PETSc problem. Could you run in the debugger and get a stack trace? Thanks, Matt > Thank you very much! > > > > Best Regards. 
> > Yuqi Wang > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jun 7 10:15:49 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 7 Jun 2022 11:15:49 -0400 Subject: [petsc-users] PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range In-Reply-To: References: Message-ID: <7BA74193-101C-4387-B13E-79DEBF52F66F@petsc.dev> That is an extremely old PETSc version. Unless you are using a package that only works with that version (and talk to the package's authors about upgrading) we recommend upgrading to the latest PETSc version. Usually, there is more information in the error message, is there more in the message you can send? > On Jun 7, 2022, at 10:00 AM, Matthew Knepley wrote: > > On Tue, Jun 7, 2022 at 9:51 AM wang yuqi > wrote: > Hi, Dear developer: > > I encountered the following problems when I run my code with PETSC-3.5.2: > > > > [46]PETSC ERROR: ------------------------------------------------------------------------ > > [46]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [46]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [46]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [46]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > Could you please help me to fix this problem? > > > > It may not be a PETSc problem. Could you run in the debugger and get a stack trace? > > Thanks, > > Matt > > Thank you very much! > > > > Best Regards. > > Yuqi Wang > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rthirumalaisam1857 at sdsu.edu Tue Jun 7 17:51:21 2022 From: rthirumalaisam1857 at sdsu.edu (Ramakrishnan Thirumalaisamy) Date: Tue, 7 Jun 2022 15:51:21 -0700 Subject: [petsc-users] How to ignore a one floating point exception and move to the next? Message-ID: Hi everyone, I am using fp_trap to debug the floating-point error in my code. Is there any way I can move from one floating point to next one When I run the code in the debugger with "-fp_trap"? I know that some floating point errors are due to uninitialized variables but those are benign. I want to move to those ones that lead to NANs or division by zero. Thanks, Rama -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Jun 7 18:10:15 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 7 Jun 2022 19:10:15 -0400 Subject: [petsc-users] How to ignore a one floating point exception and move to the next? In-Reply-To: References: Message-ID: <7E70E5E5-EBE4-4614-9646-484C6441A619@petsc.dev> PETSc uses the signal handler to catch floating point exceptions when run by default or with -fp_trap. These are hard to recover from and continue. You can run PETSc with -fp_trap off in the debugger but tell the debugger to catch the floating point exceptions. You may be able to continue from those. 
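If the benign exceptions come from an identifiable block of code, another option (a sketch, not something suggested in the thread) is to disable trapping just around that block with PETSc's trap push/pop, so -fp_trap still catches everything else:

  #include <petscsys.h>

  /* Sketch: silence FP traps only around code that is known to raise benign
     exceptions; trapping is restored afterwards. */
  PetscCall(PetscFPTrapPush(PETSC_FP_TRAP_OFF));
  /* ... block with known-benign floating point exceptions ... */
  PetscCall(PetscFPTrapPop());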
Not having uninitialized variables and strange unimportant floating point exceptions in your code is part of good housekeeping and means that when you really need to debug you can be much more efficient in the debugging process. Like trying to find something in a messy room or a well organized room. I recommend you first do the housekeeping rather than try to find ways to avoid doing the housekeeping. Barry > On Jun 7, 2022, at 6:51 PM, Ramakrishnan Thirumalaisamy wrote: > > Hi everyone, > > I am using fp_trap to debug the floating-point error in my code. Is there any way I can move from one floating point to next one When I run the code in the debugger with "-fp_trap"? I know that some floating point errors are due to uninitialized variables but those are benign. I want to move to those ones that lead to NANs or division by zero. > > > > Thanks, > Rama From jacob.fai at gmail.com Tue Jun 7 18:31:13 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Tue, 7 Jun 2022 19:31:13 -0400 Subject: [petsc-users] How to ignore a one floating point exception and move to the next? In-Reply-To: <7E70E5E5-EBE4-4614-9646-484C6441A619@petsc.dev> References: <7E70E5E5-EBE4-4614-9646-484C6441A619@petsc.dev> Message-ID: <109E15F6-5F38-4B1F-A4F4-348CBD63C5A6@gmail.com> You can also compile your code (and PETSc) using `-fsanitize=undefined` and run it to detect such errors. Note however that this will most likely also catch/error out on usage of uninitialized variables so your mileage may vary. As Barry notes this kind of stuff is much easier to debug when you don?t have to ignore other errors. Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jun 7, 2022, at 19:10, Barry Smith wrote: > > > PETSc uses the signal handler to catch floating point exceptions when run by default or with -fp_trap. These are hard to recover from and continue. > > You can run PETSc with -fp_trap off in the debugger but tell the debugger to catch the floating point exceptions. You may be able to continue from those. > > Not having uninitialized variables and strange unimportant floating point exceptions in your code is part of good housekeeping and means that when you really need to debug you can be much more efficient in the debugging process. Like trying to find something in a messy room or a well organized room. I recommend you first do the housekeeping rather than try to find ways to avoid doing the housekeeping. > > Barry > > >> On Jun 7, 2022, at 6:51 PM, Ramakrishnan Thirumalaisamy wrote: >> >> Hi everyone, >> >> I am using fp_trap to debug the floating-point error in my code. Is there any way I can move from one floating point to next one When I run the code in the debugger with "-fp_trap"? I know that some floating point errors are due to uninitialized variables but those are benign. I want to move to those ones that lead to NANs or division by zero. >> >> >> >> Thanks, >> Rama > From mi.mike1021 at gmail.com Tue Jun 7 23:14:31 2022 From: mi.mike1021 at gmail.com (Mike Michell) Date: Tue, 7 Jun 2022 23:14:31 -0500 Subject: [petsc-users] Load mesh as DMPlex along with Solution Fields obtained from External Codes Message-ID: Dear PETSc developer team, I am a user of PETSc DMPlex for a finite-volume solver. So far, I have loaded a mesh file made by Gmsh as a DMPlex object without pre-computed solution field. But what if I need to load the mesh as well as solution fields that are computed by other codes sharing the same physical domain, what is a smart way to do that? 
In other words, how can I load a DM object from a mesh file along with a defined solution field? I can think of that; load mesh to a DM object first, then declare a local (or global) vector to read & map the external solution field onto the PETSc data structure. But I can feel that this might not be the best way. Thanks, Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From sami.ben-elhaj-salah at ensma.fr Wed Jun 8 03:57:18 2022 From: sami.ben-elhaj-salah at ensma.fr (Sami BEN ELHAJ SALAH) Date: Wed, 8 Jun 2022 10:57:18 +0200 Subject: [petsc-users] Writing VTK output Message-ID: Dear Petsc Developer team, I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. 1) Algorithm 1 err = SNESSolve(_snes, bc_vec_test, solution); CHKERRABORT(FOX::Parallel::COMM_WORLD,err); PetscViewer vtk; PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); VecView(solution,vtk); PetscViewerDestroy(&vtk); 2) Algorithm 2 err = SNESSolve(_snes, bc_vec_test, solution); CHKERRABORT(FOX::Parallel::COMM_WORLD,err); PetscViewer vtk; PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); PetscViewerSetType(vtk, PETSCVIEWERVTK); PetscViewerFileSetName(vtk, "sol.vtk"); VecView(solution, vtk); PetscViewerDestroy(&vtk); The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? Other information used: - gmsh format 2.2 - Vtk version: 7.1.1 - Petsc version: 3.13/opt Below my two files gmsh and vtk: Gmsh file: $MeshFormat 2.2 0 8 $EndMeshFormat $Nodes 12 1 0.0 10.0 10.0 2 0.0 0.0 10.0 3 0.0 0.0 0.0 4 0.0 10.0 0.0 5 10.0 10.0 10.0 6 10.0 0.0 10.0 7 10.0 0.0 0.0 8 10.0 10.0 0.0 9 20.0 10.0 10.0 10 20.0 0.0 10.0 11 20.0 0.0 0.0 12 20.0 10.0 0.0 $EndNodes $Elements 2 1 5 2 68 60 1 2 3 4 5 6 7 8 2 5 2 68 60 5 6 7 8 9 10 11 12 $EndElements Vtk file : # vtk DataFile Version 2.0 Simplicial Mesh Example ASCII DATASET UNSTRUCTURED_GRID POINTS 12 double 0.000000e+00 1.000000e+01 1.000000e+01 0.000000e+00 0.000000e+00 1.000000e+01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+01 0.000000e+00 1.000000e+01 1.000000e+01 1.000000e+01 1.000000e+01 0.000000e+00 1.000000e+01 1.000000e+01 0.000000e+00 0.000000e+00 1.000000e+01 1.000000e+01 0.000000e+00 2.000000e+01 1.000000e+01 1.000000e+01 2.000000e+01 0.000000e+00 1.000000e+01 2.000000e+01 0.000000e+00 0.000000e+00 2.000000e+01 1.000000e+01 0.000000e+00 CELLS 2 18 8 0 3 2 1 4 5 6 7 8 4 7 6 5 8 9 10 11 CELL_TYPES 2 12 12 POINT_DATA 12 VECTORS dU_x double 2.754808e-10 -8.653846e-11 -8.653846e-11 2.754808e-10 8.653846e-11 -8.653846e-11 2.754808e-10 8.653846e-11 8.653846e-11 2.754808e-10 -8.653846e-11 8.653846e-11 4.678571e-01 -9.107143e-02 -9.107143e-02 4.678571e-01 9.107143e-02 -9.107143e-02 4.678571e-01 9.107143e-02 9.107143e-02 4.678571e-01 -9.107143e-02 9.107143e-02 1.000000e+00 -7.500000e-02 -7.500000e-02 1.000000e+00 7.500000e-02 -7.500000e-02 1.000000e+00 7.500000e-02 7.500000e-02 1.000000e+00 -7.500000e-02 7.500000e-02 Thank you in advance and have a good day ! Sami, -- Dr. 
Sami BEN ELHAJ SALAH Ing?nieur de Recherche (CNRS) Institut Pprime - ISAE - ENSMA Mobile: 06.62.51.26.74 Email: sami.ben-elhaj-salah at ensma.fr www.samibenelhajsalah.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Jun 8 08:37:14 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 08 Jun 2022 07:37:14 -0600 Subject: [petsc-users] Writing VTK output In-Reply-To: References: Message-ID: <87czfje0ol.fsf@jedbrown.org> You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? Sami BEN ELHAJ SALAH writes: > Dear Petsc Developer team, > > I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. > > 1) Algorithm 1 > err = SNESSolve(_snes, bc_vec_test, solution); > CHKERRABORT(FOX::Parallel::COMM_WORLD,err); > PetscViewer vtk; > PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); > VecView(solution,vtk); > PetscViewerDestroy(&vtk); > > > 2) Algorithm 2 > err = SNESSolve(_snes, bc_vec_test, solution); > CHKERRABORT(FOX::Parallel::COMM_WORLD,err); > PetscViewer vtk; > PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); > PetscViewerSetType(vtk, PETSCVIEWERVTK); > PetscViewerFileSetName(vtk, "sol.vtk"); > VecView(solution, vtk); > PetscViewerDestroy(&vtk); > > The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? 
> > Other information used: > - gmsh format 2.2 > - Vtk version: 7.1.1 > - Petsc version: 3.13/opt > > Below my two files gmsh and vtk: > > Gmsh file: > $MeshFormat > 2.2 0 8 > $EndMeshFormat > $Nodes > 12 > 1 0.0 10.0 10.0 > 2 0.0 0.0 10.0 > 3 0.0 0.0 0.0 > 4 0.0 10.0 0.0 > 5 10.0 10.0 10.0 > 6 10.0 0.0 10.0 > 7 10.0 0.0 0.0 > 8 10.0 10.0 0.0 > 9 20.0 10.0 10.0 > 10 20.0 0.0 10.0 > 11 20.0 0.0 0.0 > 12 20.0 10.0 0.0 > $EndNodes > $Elements > 2 > 1 5 2 68 60 1 2 3 4 5 6 7 8 > 2 5 2 68 60 5 6 7 8 9 10 11 12 > $EndElements > > Vtk file : > # vtk DataFile Version 2.0 > Simplicial Mesh Example > ASCII > DATASET UNSTRUCTURED_GRID > POINTS 12 double > 0.000000e+00 1.000000e+01 1.000000e+01 > 0.000000e+00 0.000000e+00 1.000000e+01 > 0.000000e+00 0.000000e+00 0.000000e+00 > 0.000000e+00 1.000000e+01 0.000000e+00 > 1.000000e+01 1.000000e+01 1.000000e+01 > 1.000000e+01 0.000000e+00 1.000000e+01 > 1.000000e+01 0.000000e+00 0.000000e+00 > 1.000000e+01 1.000000e+01 0.000000e+00 > 2.000000e+01 1.000000e+01 1.000000e+01 > 2.000000e+01 0.000000e+00 1.000000e+01 > 2.000000e+01 0.000000e+00 0.000000e+00 > 2.000000e+01 1.000000e+01 0.000000e+00 > CELLS 2 18 > 8 0 3 2 1 4 5 6 7 > 8 4 7 6 5 8 9 10 11 > CELL_TYPES 2 > 12 > 12 > POINT_DATA 12 > VECTORS dU_x double > 2.754808e-10 -8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 8.653846e-11 > 2.754808e-10 -8.653846e-11 8.653846e-11 > 4.678571e-01 -9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 9.107143e-02 > 4.678571e-01 -9.107143e-02 9.107143e-02 > 1.000000e+00 -7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 7.500000e-02 > 1.000000e+00 -7.500000e-02 7.500000e-02 > > Thank you in advance and have a good day ! > > Sami, > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com From sami.ben-elhaj-salah at ensma.fr Wed Jun 8 09:14:13 2022 From: sami.ben-elhaj-salah at ensma.fr (Sami BEN ELHAJ SALAH) Date: Wed, 8 Jun 2022 16:14:13 +0200 Subject: [petsc-users] Writing VTK output In-Reply-To: <87czfje0ol.fsf@jedbrown.org> References: <87czfje0ol.fsf@jedbrown.org> Message-ID: Hi Jed, Thank you for your answer. When I use a ??solution.vtu'', I obtain a wrong file. _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4@4@$@@   ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o_?????uP????uP??o_?????uP????uP?? o_?????uP????uP?? o_?????uP????uP??b#???????333????333??_#?????? ?333????333??b#??????(?333??'?333??a#???????333??>?333?? If I understand your answer, to solve my problem, should just upgrade all my software ? Thanks, Sami, -- Dr. Sami BEN ELHAJ SALAH Ing?nieur de Recherche (CNRS) Institut Pprime - ISAE - ENSMA Mobile: 06.62.51.26.74 Email: sami.ben-elhaj-salah at ensma.fr www.samibenelhajsalah.com > Le 8 juin 2022 ? 15:37, Jed Brown a ?crit : > > You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? > > Sami BEN ELHAJ SALAH writes: > >> Dear Petsc Developer team, >> >> I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. 
>> >> 1) Algorithm 1 >> err = SNESSolve(_snes, bc_vec_test, solution); >> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >> PetscViewer vtk; >> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >> VecView(solution,vtk); >> PetscViewerDestroy(&vtk); >> >> >> 2) Algorithm 2 >> err = SNESSolve(_snes, bc_vec_test, solution); >> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >> PetscViewer vtk; >> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >> PetscViewerSetType(vtk, PETSCVIEWERVTK); >> PetscViewerFileSetName(vtk, "sol.vtk"); >> VecView(solution, vtk); >> PetscViewerDestroy(&vtk); >> >> The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? >> >> Other information used: >> - gmsh format 2.2 >> - Vtk version: 7.1.1 >> - Petsc version: 3.13/opt >> >> Below my two files gmsh and vtk: >> >> Gmsh file: >> $MeshFormat >> 2.2 0 8 >> $EndMeshFormat >> $Nodes >> 12 >> 1 0.0 10.0 10.0 >> 2 0.0 0.0 10.0 >> 3 0.0 0.0 0.0 >> 4 0.0 10.0 0.0 >> 5 10.0 10.0 10.0 >> 6 10.0 0.0 10.0 >> 7 10.0 0.0 0.0 >> 8 10.0 10.0 0.0 >> 9 20.0 10.0 10.0 >> 10 20.0 0.0 10.0 >> 11 20.0 0.0 0.0 >> 12 20.0 10.0 0.0 >> $EndNodes >> $Elements >> 2 >> 1 5 2 68 60 1 2 3 4 5 6 7 8 >> 2 5 2 68 60 5 6 7 8 9 10 11 12 >> $EndElements >> >> Vtk file : >> # vtk DataFile Version 2.0 >> Simplicial Mesh Example >> ASCII >> DATASET UNSTRUCTURED_GRID >> POINTS 12 double >> 0.000000e+00 1.000000e+01 1.000000e+01 >> 0.000000e+00 0.000000e+00 1.000000e+01 >> 0.000000e+00 0.000000e+00 0.000000e+00 >> 0.000000e+00 1.000000e+01 0.000000e+00 >> 1.000000e+01 1.000000e+01 1.000000e+01 >> 1.000000e+01 0.000000e+00 1.000000e+01 >> 1.000000e+01 0.000000e+00 0.000000e+00 >> 1.000000e+01 1.000000e+01 0.000000e+00 >> 2.000000e+01 1.000000e+01 1.000000e+01 >> 2.000000e+01 0.000000e+00 1.000000e+01 >> 2.000000e+01 0.000000e+00 0.000000e+00 >> 2.000000e+01 1.000000e+01 0.000000e+00 >> CELLS 2 18 >> 8 0 3 2 1 4 5 6 7 >> 8 4 7 6 5 8 9 10 11 >> CELL_TYPES 2 >> 12 >> 12 >> POINT_DATA 12 >> VECTORS dU_x double >> 2.754808e-10 -8.653846e-11 -8.653846e-11 >> 2.754808e-10 8.653846e-11 -8.653846e-11 >> 2.754808e-10 8.653846e-11 8.653846e-11 >> 2.754808e-10 -8.653846e-11 8.653846e-11 >> 4.678571e-01 -9.107143e-02 -9.107143e-02 >> 4.678571e-01 9.107143e-02 -9.107143e-02 >> 4.678571e-01 9.107143e-02 9.107143e-02 >> 4.678571e-01 -9.107143e-02 9.107143e-02 >> 1.000000e+00 -7.500000e-02 -7.500000e-02 >> 1.000000e+00 7.500000e-02 -7.500000e-02 >> 1.000000e+00 7.500000e-02 7.500000e-02 >> 1.000000e+00 -7.500000e-02 7.500000e-02 >> >> Thank you in advance and have a good day ! >> >> Sami, >> >> -- >> Dr. Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Jun 8 09:25:51 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 08 Jun 2022 08:25:51 -0600 Subject: [petsc-users] Writing VTK output In-Reply-To: References: <87czfje0ol.fsf@jedbrown.org> Message-ID: <875ylbdyfk.fsf@jedbrown.org> Does the file load in paraview? 
When I load your *.msh in a tutorial with -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. -------------- next part -------------- A non-text attachment was scrubbed... Name: sami.vtu Type: model/vnd.vtu Size: 1319 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sami.png Type: image/png Size: 35231 bytes Desc: not available URL: -------------- next part -------------- Sami BEN ELHAJ SALAH writes: > Hi Jed, > > Thank you for your answer. > > When I use a ??solution.vtu'', I obtain a wrong file. > > > > > > > > > > > > > > > > > > > > > > > _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4@4@$@@ >   ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o_?????uP????uP??o_?????uP????uP??o_?????uP????uP??o_?????uP????uP??b#???????333????333??_#?????? ?333????333??b#??????(?333??'?333??a#???????333??>?333?? > > > > > If I understand your answer, to solve my problem, should just upgrade all my software ? > > Thanks, > Sami, > > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com > > > >> Le 8 juin 2022 ? 15:37, Jed Brown a ?crit : >> >> You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? >> >> Sami BEN ELHAJ SALAH writes: >> >>> Dear Petsc Developer team, >>> >>> I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. >>> >>> 1) Algorithm 1 >>> err = SNESSolve(_snes, bc_vec_test, solution); >>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>> PetscViewer vtk; >>> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >>> VecView(solution,vtk); >>> PetscViewerDestroy(&vtk); >>> >>> >>> 2) Algorithm 2 >>> err = SNESSolve(_snes, bc_vec_test, solution); >>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>> PetscViewer vtk; >>> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >>> PetscViewerSetType(vtk, PETSCVIEWERVTK); >>> PetscViewerFileSetName(vtk, "sol.vtk"); >>> VecView(solution, vtk); >>> PetscViewerDestroy(&vtk); >>> >>> The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? 
>>> >>> Other information used: >>> - gmsh format 2.2 >>> - Vtk version: 7.1.1 >>> - Petsc version: 3.13/opt >>> >>> Below my two files gmsh and vtk: >>> >>> Gmsh file: >>> $MeshFormat >>> 2.2 0 8 >>> $EndMeshFormat >>> $Nodes >>> 12 >>> 1 0.0 10.0 10.0 >>> 2 0.0 0.0 10.0 >>> 3 0.0 0.0 0.0 >>> 4 0.0 10.0 0.0 >>> 5 10.0 10.0 10.0 >>> 6 10.0 0.0 10.0 >>> 7 10.0 0.0 0.0 >>> 8 10.0 10.0 0.0 >>> 9 20.0 10.0 10.0 >>> 10 20.0 0.0 10.0 >>> 11 20.0 0.0 0.0 >>> 12 20.0 10.0 0.0 >>> $EndNodes >>> $Elements >>> 2 >>> 1 5 2 68 60 1 2 3 4 5 6 7 8 >>> 2 5 2 68 60 5 6 7 8 9 10 11 12 >>> $EndElements >>> >>> Vtk file : >>> # vtk DataFile Version 2.0 >>> Simplicial Mesh Example >>> ASCII >>> DATASET UNSTRUCTURED_GRID >>> POINTS 12 double >>> 0.000000e+00 1.000000e+01 1.000000e+01 >>> 0.000000e+00 0.000000e+00 1.000000e+01 >>> 0.000000e+00 0.000000e+00 0.000000e+00 >>> 0.000000e+00 1.000000e+01 0.000000e+00 >>> 1.000000e+01 1.000000e+01 1.000000e+01 >>> 1.000000e+01 0.000000e+00 1.000000e+01 >>> 1.000000e+01 0.000000e+00 0.000000e+00 >>> 1.000000e+01 1.000000e+01 0.000000e+00 >>> 2.000000e+01 1.000000e+01 1.000000e+01 >>> 2.000000e+01 0.000000e+00 1.000000e+01 >>> 2.000000e+01 0.000000e+00 0.000000e+00 >>> 2.000000e+01 1.000000e+01 0.000000e+00 >>> CELLS 2 18 >>> 8 0 3 2 1 4 5 6 7 >>> 8 4 7 6 5 8 9 10 11 >>> CELL_TYPES 2 >>> 12 >>> 12 >>> POINT_DATA 12 >>> VECTORS dU_x double >>> 2.754808e-10 -8.653846e-11 -8.653846e-11 >>> 2.754808e-10 8.653846e-11 -8.653846e-11 >>> 2.754808e-10 8.653846e-11 8.653846e-11 >>> 2.754808e-10 -8.653846e-11 8.653846e-11 >>> 4.678571e-01 -9.107143e-02 -9.107143e-02 >>> 4.678571e-01 9.107143e-02 -9.107143e-02 >>> 4.678571e-01 9.107143e-02 9.107143e-02 >>> 4.678571e-01 -9.107143e-02 9.107143e-02 >>> 1.000000e+00 -7.500000e-02 -7.500000e-02 >>> 1.000000e+00 7.500000e-02 -7.500000e-02 >>> 1.000000e+00 7.500000e-02 7.500000e-02 >>> 1.000000e+00 -7.500000e-02 7.500000e-02 >>> >>> Thank you in advance and have a good day ! >>> >>> Sami, >>> >>> -- >>> Dr. Sami BEN ELHAJ SALAH >>> Ing?nieur de Recherche (CNRS) >>> Institut Pprime - ISAE - ENSMA >>> Mobile: 06.62.51.26.74 >>> Email: sami.ben-elhaj-salah at ensma.fr >>> www.samibenelhajsalah.com From sami.ben-elhaj-salah at ensma.fr Wed Jun 8 10:24:15 2022 From: sami.ben-elhaj-salah at ensma.fr (Sami BEN ELHAJ SALAH) Date: Wed, 8 Jun 2022 17:24:15 +0200 Subject: [petsc-users] Writing VTK output In-Reply-To: <875ylbdyfk.fsf@jedbrown.org> References: <87czfje0ol.fsf@jedbrown.org> <875ylbdyfk.fsf@jedbrown.org> Message-ID: <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> Yes, the file "sami.vtu" is loaded correctly in paraview and I have the good output like you. In my code, I tried with the same command given in your last answer and I still have the wrong .vtu file. I use this: mpirun -np 1 /home/benelhasa/fox_petsc/build_test/bin/Debug/FoXtroT -snes_test_jacobian_view -snes_converged_reason -snes_monitor -ksp_monitor -ksp_xmonitor -dm_plex_filename cub_2C3D8_msh.msh -dm_view vtk:cub_2C3D8_msh.vtu cub_8C3D8.fxt Thanks, Sami, -- Dr. Sami BEN ELHAJ SALAH Ing?nieur de Recherche (CNRS) Institut Pprime - ISAE - ENSMA Mobile: 06.62.51.26.74 Email: sami.ben-elhaj-salah at ensma.fr www.samibenelhajsalah.com > Le 8 juin 2022 ? 16:25, Jed Brown a ?crit : > > Does the file load in paraview? When I load your *.msh in a tutorial with -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. > > > Sami BEN ELHAJ SALAH > writes: > >> Hi Jed, >> >> Thank you for your answer. >> >> When I use a ??solution.vtu'', I obtain a wrong file. 
>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4 at 4@$@@ >>   ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o_?????uP????uP??o_?????uP????uP??o_?????uP????uP??o_?????uP????uP??b#???????333????333??_#?????? ?333????333??b#??????(?333??'?333??a#???????333??>?333?? >> >> >> >> >> If I understand your answer, to solve my problem, should just upgrade all my software ? >> >> Thanks, >> Sami, >> >> >> -- >> Dr. Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com > >> >> >> >>> Le 8 juin 2022 ? 15:37, Jed Brown a ?crit : >>> >>> You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? >>> >>> Sami BEN ELHAJ SALAH writes: >>> >>>> Dear Petsc Developer team, >>>> >>>> I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. >>>> >>>> 1) Algorithm 1 >>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>> PetscViewer vtk; >>>> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >>>> VecView(solution,vtk); >>>> PetscViewerDestroy(&vtk); >>>> >>>> >>>> 2) Algorithm 2 >>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>> PetscViewer vtk; >>>> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >>>> PetscViewerSetType(vtk, PETSCVIEWERVTK); >>>> PetscViewerFileSetName(vtk, "sol.vtk"); >>>> VecView(solution, vtk); >>>> PetscViewerDestroy(&vtk); >>>> >>>> The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? 
>>>> >>>> Other information used: >>>> - gmsh format 2.2 >>>> - Vtk version: 7.1.1 >>>> - Petsc version: 3.13/opt >>>> >>>> Below my two files gmsh and vtk: >>>> >>>> Gmsh file: >>>> $MeshFormat >>>> 2.2 0 8 >>>> $EndMeshFormat >>>> $Nodes >>>> 12 >>>> 1 0.0 10.0 10.0 >>>> 2 0.0 0.0 10.0 >>>> 3 0.0 0.0 0.0 >>>> 4 0.0 10.0 0.0 >>>> 5 10.0 10.0 10.0 >>>> 6 10.0 0.0 10.0 >>>> 7 10.0 0.0 0.0 >>>> 8 10.0 10.0 0.0 >>>> 9 20.0 10.0 10.0 >>>> 10 20.0 0.0 10.0 >>>> 11 20.0 0.0 0.0 >>>> 12 20.0 10.0 0.0 >>>> $EndNodes >>>> $Elements >>>> 2 >>>> 1 5 2 68 60 1 2 3 4 5 6 7 8 >>>> 2 5 2 68 60 5 6 7 8 9 10 11 12 >>>> $EndElements >>>> >>>> Vtk file : >>>> # vtk DataFile Version 2.0 >>>> Simplicial Mesh Example >>>> ASCII >>>> DATASET UNSTRUCTURED_GRID >>>> POINTS 12 double >>>> 0.000000e+00 1.000000e+01 1.000000e+01 >>>> 0.000000e+00 0.000000e+00 1.000000e+01 >>>> 0.000000e+00 0.000000e+00 0.000000e+00 >>>> 0.000000e+00 1.000000e+01 0.000000e+00 >>>> 1.000000e+01 1.000000e+01 1.000000e+01 >>>> 1.000000e+01 0.000000e+00 1.000000e+01 >>>> 1.000000e+01 0.000000e+00 0.000000e+00 >>>> 1.000000e+01 1.000000e+01 0.000000e+00 >>>> 2.000000e+01 1.000000e+01 1.000000e+01 >>>> 2.000000e+01 0.000000e+00 1.000000e+01 >>>> 2.000000e+01 0.000000e+00 0.000000e+00 >>>> 2.000000e+01 1.000000e+01 0.000000e+00 >>>> CELLS 2 18 >>>> 8 0 3 2 1 4 5 6 7 >>>> 8 4 7 6 5 8 9 10 11 >>>> CELL_TYPES 2 >>>> 12 >>>> 12 >>>> POINT_DATA 12 >>>> VECTORS dU_x double >>>> 2.754808e-10 -8.653846e-11 -8.653846e-11 >>>> 2.754808e-10 8.653846e-11 -8.653846e-11 >>>> 2.754808e-10 8.653846e-11 8.653846e-11 >>>> 2.754808e-10 -8.653846e-11 8.653846e-11 >>>> 4.678571e-01 -9.107143e-02 -9.107143e-02 >>>> 4.678571e-01 9.107143e-02 -9.107143e-02 >>>> 4.678571e-01 9.107143e-02 9.107143e-02 >>>> 4.678571e-01 -9.107143e-02 9.107143e-02 >>>> 1.000000e+00 -7.500000e-02 -7.500000e-02 >>>> 1.000000e+00 7.500000e-02 -7.500000e-02 >>>> 1.000000e+00 7.500000e-02 7.500000e-02 >>>> 1.000000e+00 -7.500000e-02 7.500000e-02 >>>> >>>> Thank you in advance and have a good day ! >>>> >>>> Sami, >>>> >>>> -- >>>> Dr. Sami BEN ELHAJ SALAH >>>> Ing?nieur de Recherche (CNRS) >>>> Institut Pprime - ISAE - ENSMA >>>> Mobile: 06.62.51.26.74 >>>> Email: sami.ben-elhaj-salah at ensma.fr >>>> www.samibenelhajsalah.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jun 8 10:57:47 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 8 Jun 2022 11:57:47 -0400 Subject: [petsc-users] Writing VTK output In-Reply-To: <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> References: <87czfje0ol.fsf@jedbrown.org> <875ylbdyfk.fsf@jedbrown.org> <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> Message-ID: On Wed, Jun 8, 2022 at 11:24 AM Sami BEN ELHAJ SALAH < sami.ben-elhaj-salah at ensma.fr> wrote: > Yes, the file "sami.vtu" is loaded correctly in paraview and I have the > good output like you. > > In my code, I tried with the same command given in your last answer and I > still have the wrong .vtu file. > Hi Sami, What do you mean by wrong? Can you just use the simple procedure: PetscCall(DMCreate(comm, dm)); PetscCall(DMSetType(*dm, DMPLEX)); PetscCall(DMSetFromOptions(*dm)); PetscCall(DMViewFromOptions(*dm, NULL, "-dm_view")); This is the one that works for us. Then we can change it in your code one step at a time until you get what you need. 
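For reference, a minimal self-contained sketch of that procedure (an illustration only, not code from your tree; the executable name "meshcheck" and the run line below are placeholders):

#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM dm;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(DMCreate(PETSC_COMM_WORLD, &dm));
  PetscCall(DMSetType(dm, DMPLEX));
  PetscCall(DMSetFromOptions(dm));                     /* picks up -dm_plex_filename and other -dm_plex_* options */
  PetscCall(DMViewFromOptions(dm, NULL, "-dm_view"));  /* honors -dm_view, e.g. -dm_view vtk:mesh.vtu */
  PetscCall(DMDestroy(&dm));
  PetscCall(PetscFinalize());
  return 0;
}

Run it as, e.g.,

  mpirun -np 1 ./meshcheck -dm_plex_filename cub_2C3D8_msh.msh -dm_view vtk:mesh.vtu

and once that writes a sane mesh.vtu, the PetscViewerVTKOpen/VecView pair in your code only needs the file name changed to a .vtu suffix to get the modern (non-legacy) format.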
Thanks, Matt > I use this: > mpirun -np 1 /home/benelhasa/fox_petsc/build_test/bin/Debug/FoXtroT > -snes_test_jacobian_view -snes_converged_reason -snes_monitor -ksp_monitor > -ksp_xmonitor -dm_plex_filename cub_2C3D8_msh.msh -dm_view > vtk:cub_2C3D8_msh.vtu cub_8C3D8.fxt > > > Thanks, > Sami, > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com > > > > > Le 8 juin 2022 ? 16:25, Jed Brown a ?crit : > > Does the file load in paraview? When I load your *.msh in a tutorial with > -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. > > > Sami BEN ELHAJ SALAH writes: > > Hi Jed, > > Thank you for your answer. > > When I use a ??solution.vtu'', I obtain a wrong file. > > > > > > > format="appended" offset="0" /> > > > format="appended" offset="292" /> > format="appended" offset="360" /> > format="appended" offset="372" /> > > > format="appended" offset="378" /> > > > format="appended" offset="390" /> > > > > > _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4 at 4@$@@ > ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o > _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? > uP??b#???????333????333??_#?????? > ?333????333??b#??????(?333??'?333??a#???????333??>?333?? > > > > > If I understand your answer, to solve my problem, should just upgrade all > my software ? > > Thanks, > Sami, > > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com < > https://samiben91.github.io/samibenelhajsalah/index.html> > > > > Le 8 juin 2022 ? 15:37, Jed Brown a ?crit : > > You're using pretty old versions of all software; I'd recommend upgrading. > I recommend choosing the file name "solution.vtu" to use the modern > (non-legacy) format. Does that work for you? > > Sami BEN ELHAJ SALAH writes: > > Dear Petsc Developer team, > > I solved a linear elastic problem in 3D using a DMPLEX. My system is > converging, then I would like to write out my solution vector to a vtk file > where I use unstructured mesh. Currently, I tried two algorithms and I have > the same result. > > 1) Algorithm 1 > err = SNESSolve(_snes, bc_vec_test, solution); > CHKERRABORT(FOX::Parallel::COMM_WORLD,err); > PetscViewer vtk; > > PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); > > VecView(solution,vtk); > PetscViewerDestroy(&vtk); > > > 2) Algorithm 2 > err = SNESSolve(_snes, bc_vec_test, solution); > CHKERRABORT(FOX::Parallel::COMM_WORLD,err); > PetscViewer vtk; > PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); > PetscViewerSetType(vtk, PETSCVIEWERVTK); > PetscViewerFileSetName(vtk, "sol.vtk"); > VecView(solution, vtk); > PetscViewerDestroy(&vtk); > > The result seems correct except for the rotation order of the nodes (see > the red lines on gmsh and vtk file respectively). Then, I visualized my vtk > file with paraview, and I remarked that my geometry is not correct and not > conserved when comparing it with my gmsh file. So, I didn?t understand why > the rotation order of nodes is not conserved when saving my result to a vtk > file? 
> > Other information used: > - gmsh format 2.2 > - Vtk version: 7.1.1 > - Petsc version: 3.13/opt > > Below my two files gmsh and vtk: > > Gmsh file: > $MeshFormat > 2.2 0 8 > $EndMeshFormat > $Nodes > 12 > 1 0.0 10.0 10.0 > 2 0.0 0.0 10.0 > 3 0.0 0.0 0.0 > 4 0.0 10.0 0.0 > 5 10.0 10.0 10.0 > 6 10.0 0.0 10.0 > 7 10.0 0.0 0.0 > 8 10.0 10.0 0.0 > 9 20.0 10.0 10.0 > 10 20.0 0.0 10.0 > 11 20.0 0.0 0.0 > 12 20.0 10.0 0.0 > $EndNodes > $Elements > 2 > 1 5 2 68 60 1 2 3 4 5 6 7 8 > 2 5 2 68 60 5 6 7 8 9 10 11 12 > $EndElements > > Vtk file : > # vtk DataFile Version 2.0 > Simplicial Mesh Example > ASCII > DATASET UNSTRUCTURED_GRID > POINTS 12 double > 0.000000e+00 1.000000e+01 1.000000e+01 > 0.000000e+00 0.000000e+00 1.000000e+01 > 0.000000e+00 0.000000e+00 0.000000e+00 > 0.000000e+00 1.000000e+01 0.000000e+00 > 1.000000e+01 1.000000e+01 1.000000e+01 > 1.000000e+01 0.000000e+00 1.000000e+01 > 1.000000e+01 0.000000e+00 0.000000e+00 > 1.000000e+01 1.000000e+01 0.000000e+00 > 2.000000e+01 1.000000e+01 1.000000e+01 > 2.000000e+01 0.000000e+00 1.000000e+01 > 2.000000e+01 0.000000e+00 0.000000e+00 > 2.000000e+01 1.000000e+01 0.000000e+00 > CELLS 2 18 > 8 0 3 2 1 4 5 6 7 > 8 4 7 6 5 8 9 10 11 > CELL_TYPES 2 > 12 > 12 > POINT_DATA 12 > VECTORS dU_x double > 2.754808e-10 -8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 8.653846e-11 > 2.754808e-10 -8.653846e-11 8.653846e-11 > 4.678571e-01 -9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 9.107143e-02 > 4.678571e-01 -9.107143e-02 9.107143e-02 > 1.000000e+00 -7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 7.500000e-02 > 1.000000e+00 -7.500000e-02 7.500000e-02 > > Thank you in advance and have a good day ! > > Sami, > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com < > https://samiben91.github.io/samibenelhajsalah/index.html> > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From armand.touminet at protonmail.com Thu Jun 9 10:25:02 2022 From: armand.touminet at protonmail.com (Armand Touminet) Date: Thu, 09 Jun 2022 15:25:02 +0000 Subject: [petsc-users] VecConcatenate in petsc4py Message-ID: Dear Petsc team, I'm trying to implement PDE constrained optimization using TAO from the petsc4py interface. Since my problem has multiple parameter fields to optimize, I need to combine them into a single Vec object to supply to TAO. I've found the VecConcatenate in the C documentation, which appears to do exactly what I need, however this function does not seem to exist in the python interface (or at least I was unable to find it). Is there an other easy way to combine vectors from python? Thanks for your help, Armand Touminet -------------- next part -------------- An HTML attachment was scrubbed... URL: From zjorti at lanl.gov Thu Jun 9 16:19:51 2022 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Thu, 9 Jun 2022 21:19:51 +0000 Subject: [petsc-users] Question about SuperLU Message-ID: Hi, I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. 
At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: a) SuperLU: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist Option a) yields the following error: " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT " whereas options b) seems to be working well. Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? Many thanks. Best, Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Jun 9 16:36:47 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 09 Jun 2022 15:36:47 -0600 Subject: [petsc-users] VecConcatenate in petsc4py In-Reply-To: References: Message-ID: <875yl9iknk.fsf@jedbrown.org> You don't want to create a new vector here, but read from (and write to) multiple parts of the same vector. You can use PETSc interfaces for subvectors, or do it with NumPy slices (perhaps more natural and ergonomic, depending on how your code is written). Armand Touminet via petsc-users writes: > Dear Petsc team, > > I'm trying to implement PDE constrained optimization using TAO from the petsc4py interface. > Since my problem has multiple parameter fields to optimize, I need to combine them into a single Vec object to supply to TAO. I've found the VecConcatenate in the C documentation, which appears to do exactly what I need, however this function does not seem to exist in the python interface (or at least I was unable to find it). > Is there an other easy way to combine vectors from python? > > Thanks for your help, > > Armand Touminet From xsli at lbl.gov Thu Jun 9 19:28:13 2022 From: xsli at lbl.gov (Xiaoye S. Li) Date: Thu, 9 Jun 2022 17:28:13 -0700 Subject: [petsc-users] Question about SuperLU In-Reply-To: References: Message-ID: Are you using serial SuperLU, or distributed-memory SuperLU_DIST? What are the algorithm options are you using? 
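(For context, and only as a non-exhaustive sketch: the SuperLU_DIST settings actually in effect are printed by -ksp_view on the corresponding solver, and the usual runtime knobs are options of the form

  -mat_superlu_dist_rowperm            (row permutation; the default is LargeDiag_MC64)
  -mat_superlu_dist_colperm            (column / fill-reducing ordering)
  -mat_superlu_dist_replacetinypivot   (replace tiny pivots during factorization)

prefixed appropriately when the factorization sits inside a fieldsplit. The MATSOLVERSUPERLU_DIST manual page linked later in this thread has the authoritative list.)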
Sherry Li On Thu, Jun 9, 2022 at 2:20 PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and > for the preconditioning part, I am using a FieldSplit preconditioner. At > the last fieldsplit/level, we are left with a {B,V} block that tried to > precondition in 2 different ways: > a) SuperLU: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type > superlu_dist > b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V > and B blocks: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition > selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type > preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type > lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type > superlu_dist > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type > superlu_dist > > Option a) yields the following error: > " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL > iterations 0 > Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to > CONVERGED_RTOL iterations 1 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve > converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did > not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to FACTOR_NUMERIC_ZEROPIVOT " > whereas options b) seems to be working well. > Is it possible that the SuperLU on the {V,B} block uses a reordering that > introduces a zero pivot or could there be another explanation for this > error? > > Many thanks. > Best, > > Zakariae > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsfaraway at gmail.com Fri Jun 10 07:27:06 2022 From: jsfaraway at gmail.com (jsfaraway) Date: Fri, 10 Jun 2022 20:27:06 +0800 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <4C029833-EC0F-494F-911F-D795375718D9@gmail.com> Message-ID: <6E7BB1A2-5E08-4C99-93EB-77D14B23B44E@gmail.com> An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_view.txt URL: From jroman at dsic.upv.es Fri Jun 10 07:50:47 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 10 Jun 2022 14:50:47 +0200 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: Message-ID: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> The value -eps_ncv 5000 is huge. Better let SLEPc use the default value. Jose > El 10 jun 2022, a las 14:24, Jin Runfeng escribi?: > > Hello! > I want to acquire the 3 smallest eigenvalue, and attachment is the log view output. I can see epssolve really cost the major time. But I can not see why it cost so much time. Can you see something from it? > > Thank you ! > > Runfeng Jin > > On 6? 4 2022, at 1:37 ??, Jose E. 
Roman wrote: > Convergence depends on distribution of eigenvalues you want to compute. On the other hand, the cost also depends on the time it takes to build the preconditioner. Use -log_view to see the cost of the different steps of the computation. > > Jose > > > > El 3 jun 2022, a las 18:50, jsfaraway escribi?: > > > > hello! > > > > I am trying to use epsgd compute matrix's one smallest eigenvalue. And I find a strang thing. There are two matrix A(900000*900000) and B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B use 22 iterations and 38885s! What could be the reason for this? Or what can I do to find the reason? > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". > > And there is one difference I can tell is matrix B has many small value, whose absolute value is less than 10-6. Could this be the reason? > > > > Thank you! > > > > Runfeng Jin > From bsmith at petsc.dev Fri Jun 10 08:32:01 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 10 Jun 2022 09:32:01 -0400 Subject: [petsc-users] Question about SuperLU In-Reply-To: References: Message-ID: It is difficult to tell exactly how the preconditioner is being formed with the information below it looks like in the first case: the original B diagonal block and V diagonal block of the matrix are being factored separately with SuperLU_DIST second case: the B block is factored with SuperLU_DIST and an explicit approximation to a Schur complement of the V block (Schur complement on eliminating the B block) is formed using "Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's %sdiagonal's inverse" (this is the printout from a KSPView() for this part of the preconditioner). My guess is you have a "Stokes"-like problem where the V block is identically 0 so, of course, the SuperLU_DIST will fail on it. But the approximation of the Schur complement onto that block is not singular so SuperLU_DIST has no trouble. If I am wrong and the V block is not identically 0 then it may be singular (or possibly, but less likely just badly order) so that SuperLU_DIST encounters a zero pivot. You can run with -ksp_view_pre to have the KSP print the KSP solver algorithm details BEFORE the linear solve (hence they would get printed despite your failed solve). That would be useful to see exactly what your preconditioner is. You can use -ksp_view_pmat (with appropriate prefix) to display the matrix that is going to be factored. Thus you can quickly verify what V is. If you run with -ksp_error_if_not_converged then the solver will stop exactly when the zero pivot is encountered; this would include some information from SuperLU_DIST which might include the row number etc. Notes on PETSc improvements needed. 1) The man page for KSPCheckSolve() is terribly misleading 2) It would be nice to have a view that displayed the nested fieldsplit preconditioners more clearly > On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users wrote: > > Hi, > > I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. 
At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: > a) SuperLU: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist > b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist > > Option a) yields the following error: > " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 > Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to FACTOR_NUMERIC_ZEROPIVOT " > whereas options b) seems to be working well. > Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? > > Many thanks. > Best, > > Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 10 09:11:11 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Jun 2022 10:11:11 -0400 Subject: [petsc-users] Question about SuperLU In-Reply-To: References: Message-ID: On Thu, Jun 9, 2022 at 5:20 PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and > for the preconditioning part, I am using a FieldSplit preconditioner. 
At > the last fieldsplit/level, we are left with a {B,V} block that tried to > precondition in 2 different ways: > a) SuperLU: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type > superlu_dist > b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V > and B blocks: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition > selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type > preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type > lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type > superlu_dist > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type > superlu_dist > > Option a) yields the following error: > " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL > iterations 0 > Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to > CONVERGED_RTOL iterations 1 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve > converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did > not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to FACTOR_NUMERIC_ZEROPIVOT " > whereas options b) seems to be working well. > Is it possible that the SuperLU on the {V,B} block uses a reordering that > introduces a zero pivot or could there be another explanation for this > error? > I can at least come up with a case where this is true. Suppose you have / A 0 \ \ 0 I / where A is rank deficient, but has a positive diagonal. SuperLU will fail since it is actually singular. However, your Schur complement might work since you use 'selfp' for the Schur preconditioner, and it just extracts the diagonal. Thanks, Matt > Many thanks. > Best, > > Zakariae > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tangqi at msu.edu Fri Jun 10 09:43:43 2022 From: tangqi at msu.edu (Tang, Qi) Date: Fri, 10 Jun 2022 14:43:43 +0000 Subject: [petsc-users] Question about SuperLU Message-ID: <451BCBEE-FCC1-44D2-946F-4AF80403E6A9@msu.edu> ?We use superlu_dist. We have a 2 x 2 block where directly calling suplerlu_dist fails, but a pc based on a fieldsplit Schur complement + superlu_dist on the assembled Schur complement matrix converges. (All the converge criteria are default at this level) I am having a hard time to understand what is going on. The B,V block is of size 240K, so it is also hard to analyze. And the mat is not something we explicitly formed. It is formed by finite difference coloring Jacobian + a few levels of Schur complement. / A 0 \ \ 0 I / Matt, I do not see this can explain why the second pc with superlu on S = A would succeed, if A is not full rank. 
I believe I found somewhere it says petsc?s pclu (or maybe superlu_dist) did reordering and it may introduce 0 pivoting. We are asking because it seems there is something we do not understand from pclu/superlu level. Anyway, is there a way to output the mat before it fails? We have been struggling to do that. We have TSSolve->SNES->FDColoringJacobian->A few levels of fieldsplit->failed Subblock matrix, which we want to analyze. (Sometimes it even happens in the second Newton iteration as the first one works okay.) Qi On Jun 10, 2022, at 8:11 AM, Matthew Knepley wrote: ? On Thu, Jun 9, 2022 at 5:20 PM Jorti, Zakariae via petsc-users > wrote: Hi, I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: a) SuperLU: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist Option a) yields the following error: " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT " whereas options b) seems to be working well. Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? I can at least come up with a case where this is true. Suppose you have / A 0 \ \ 0 I / where A is rank deficient, but has a positive diagonal. SuperLU will fail since it is actually singular. However, your Schur complement might work since you use 'selfp' for the Schur preconditioner, and it just extracts the diagonal. Thanks, Matt Many thanks. Best, Zakariae -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
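(On the question above of dumping the offending sub-block before the factorization fails: a sketch, built only from options that appear elsewhere in this thread, is to add the prefixed viewing option to the failing inner solve, e.g.

  -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab
  -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged

so the {B,V} matrix handed to that inner solve gets written to the file BVmat and the run stops at the zero pivot. The binary file can then be read into MATLAB with the PetscBinaryRead.m helper shipped in $PETSC_DIR/share/petsc/matlab to check its rank and pivots. The exact prefix depends on the fieldsplit nesting, so the option names here are placeholders to adapt.)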
URL: From patrick.sanan at gmail.com Fri Jun 10 11:54:30 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 10 Jun 2022 18:54:30 +0200 Subject: [petsc-users] Mat created by DMStag cannot access ghost points In-Reply-To: References: <859FA50E-F3C2-4E54-AEDC-7C8A70D3FCE1@petsc.dev> Message-ID: Sorry about the long delay on this. https://gitlab.com/petsc/petsc/-/merge_requests/5329 Am Do., 2. Juni 2022 um 15:01 Uhr schrieb Matthew Knepley : > On Thu, Jun 2, 2022 at 8:59 AM Patrick Sanan > wrote: > >> Thanks, Barry and Changqing! That seems reasonable to me, so I'll make an >> MR with that change. >> > > Hi Patrick, > > In the MR, could you add that option to all places we internally use > Preallocator? I think we mean it for those. > > Thanks, > > Matt > > >> Am Mi., 1. Juni 2022 um 20:06 Uhr schrieb Barry Smith : >> >>> >>> This appears to be a bug in the DMStag/Mat preallocator code. If you >>> add after the DMCreateMatrix() line in your code >>> >>> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE)); >>> >>> Your code will run correctly. >>> >>> Patrick and Matt, >>> >>> MatPreallocatorPreallocate_Preallocator() has >>> >>> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, p->nooffproc)); >>> >>> to make the assembly of the stag matrix from the preallocator matrix a >>> little faster, >>> >>> but then it never "undoes" this call. Hence the matrix is left in the >>> state where it will error if someone sets values from a different rank >>> (which they certainly can using DMStagMatSetValuesStencil(). >>> >>> I think you need to clear the NO_OFF_PROC at the end >>> of MatPreallocatorPreallocate_Preallocator() because just because the >>> preallocation process never needed communication does not mean that when >>> someone puts real values in the matrix they will never use communication; >>> they can put in values any dang way they please. >>> >>> I don't know why this bug has not come up before. >>> >>> Barry >>> >>> >>> On May 31, 2022, at 11:08 PM, Ye Changqing >>> wrote: >>> >>> Dear all, >>> >>> [BugReport.c] is a sample code, [BugReportParallel.output] is the output >>> when execute BugReport with mpiexec, [BugReportSerial.output] is the output >>> in serial execution. >>> >>> Best, >>> Changqing >>> >>> ------------------------------ >>> *???:* Dave May >>> *????:* 2022?5?31? 22:55 >>> *???:* Ye Changqing >>> *??:* petsc-users at mcs.anl.gov >>> *??:* Re: [petsc-users] Mat created by DMStag cannot access ghost points >>> >>> >>> >>> On Tue 31. May 2022 at 16:28, Ye Changqing >>> wrote: >>> >>> Dear developers of PETSc, >>> >>> I encountered a problem when using the DMStag module. The program could >>> be executed perfectly in serial, while errors are thrown out in parallel >>> (using mpiexec). Some rows in Mat cannot be accessed in local processes >>> when looping all elements in DMStag. The DM object I used only has one DOF >>> in each element. Hence, I could switch to the DMDA module easily, and the >>> program now is back to normal. >>> >>> Some snippets are below. 
>>> >>> Initialise a DMStag object: >>> PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, >>> DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1, >>> DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P))); >>> Created a Mat: >>> PetscCall(DMCreateMatrix(s_ctx->dm_P, A)); >>> Loop: >>> PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx, >>> &ny, &nz, &extrax, &extray, &extraz)); >>> for (ey = starty; ey < starty + ny; ++ey) >>> for (ex = startx; ex < startx + nx; ++ex) >>> { >>> ... >>> PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2, >>> &col[0], &val_A[0][0], ADD_VALUES)); // The traceback shows the problem is >>> in here. >>> } >>> >>> >>> In addition to the code or MWE, please forward us the complete stack >>> trace / error thrown to stdout. >>> >>> Thanks, >>> Dave >>> >>> >>> >>> Best, >>> Changqing >>> >>> >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From 100442268 at alumnos.uc3m.es Fri Jun 10 10:47:38 2022 From: 100442268 at alumnos.uc3m.es (NILTON SANTOS VALDIVIA) Date: Fri, 10 Jun 2022 17:47:38 +0200 Subject: [petsc-users] Load a Sparse Matrix from a MAT file in PETSC Message-ID: Hello, I was trying to load a sparse matrix from a .MAT file (and solve the linear system). But even though, I have extracted the A and b matrix and vector from this suite *https://sparse.tamu.edu/FEMLAB/waveguide3D * and saved the variables (A, b) in a MAT file as PETSC recommends, I couldn't be able to load the matrix. Is there something that I was doing wrong? or do you know a way to load a matrix from the above link? Actually I was using src/ksp/ksp/tutorials/ex10.c.html example to try to load a .MAT file (containing A and b) without success. Best Regards, NILTON SANTOS -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 10 12:57:02 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Jun 2022 13:57:02 -0400 Subject: [petsc-users] Load a Sparse Matrix from a MAT file in PETSC In-Reply-To: References: Message-ID: On Fri, Jun 10, 2022 at 1:15 PM NILTON SANTOS VALDIVIA < 100442268 at alumnos.uc3m.es> wrote: > Hello, > > I was trying to load a sparse matrix from a .MAT file (and solve the > linear system). But even though, I have extracted the A and b matrix and > vector from this suite *https://sparse.tamu.edu/FEMLAB/waveguide3D > * and saved the variables > (A, b) in a MAT file as PETSC recommends, I couldn't be able to load the > matrix. Is there something that I was doing wrong? or do you know a way to > load a matrix from the above link? > > Actually I was using src/ksp/ksp/tutorials/ex10.c.html > example to try > to load a .MAT file (containing A and b) without success. > Do you want to load a Matrix Market format matrix and vector? If so, this is what https://petsc.org/main/src/mat/tests/ex72.c.html does. Thanks, Matt > > Best Regards, > > NILTON SANTOS > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
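(A sketch of how the convert-then-solve workflow discussed in this thread might look on the command line; the file names, process count, and solver choices below are placeholders, and the exact input/output option names of ex72 should be taken from the comment header of ex72.c itself:

  # convert the Matrix Market file to PETSc binary with src/mat/tests/ex72,
  # then solve in parallel with src/ksp/ksp/tutorials/ex10, e.g.
  cd $PETSC_DIR/src/ksp/ksp/tutorials
  make ex10
  mpiexec -n 4 ./ex10 -f0 waveguide3D.petsc -ksp_type gmres -pc_type bjacobi \
      -ksp_monitor_true_residual -ksp_converged_reason -log_view

ex10 loads the matrix, and a right-hand side when one is stored in the same binary file, from the file given with -f0.)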
URL: From 100442268 at alumnos.uc3m.es Fri Jun 10 14:58:48 2022 From: 100442268 at alumnos.uc3m.es (NILTON SANTOS VALDIVIA) Date: Fri, 10 Jun 2022 21:58:48 +0200 Subject: [petsc-users] Load a Sparse Matrix from a MAT file in PETSC In-Reply-To: References: Message-ID: Hello Matthew, Thank you very much for your answer. What I'm trying to do is to solve a linear system using the A and b (matrix and vector) provided in https://sparse.tamu.edu/FEMLAB/waveguide3D (no matter if its a .mat or a .mtx file) and also I want to solve the system with multiple MPI processes, I'm not an expert on it, just starting to understand the procedure. I'd really appreciate if you could help me with this exercise. Best Regards, NILTON SANTOS El vie, 10 jun 2022 a las 19:57, Matthew Knepley () escribi?: > On Fri, Jun 10, 2022 at 1:15 PM NILTON SANTOS VALDIVIA < > 100442268 at alumnos.uc3m.es> wrote: > >> Hello, >> >> I was trying to load a sparse matrix from a .MAT file (and solve the >> linear system). But even though, I have extracted the A and b matrix and >> vector from this suite *https://sparse.tamu.edu/FEMLAB/waveguide3D >> * and saved the variables >> (A, b) in a MAT file as PETSC recommends, I couldn't be able to load the >> matrix. Is there something that I was doing wrong? or do you know a way to >> load a matrix from the above link? >> >> Actually I was using src/ksp/ksp/tutorials/ex10.c.html >> example to >> try to load a .MAT file (containing A and b) without success. >> > > Do you want to load a Matrix Market format matrix and vector? If so, this > is what https://petsc.org/main/src/mat/tests/ex72.c.html does. > > Thanks, > > Matt > > >> >> Best Regards, >> >> NILTON SANTOS >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at mcmaster.ca Fri Jun 10 15:06:22 2022 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Fri, 10 Jun 2022 20:06:22 +0000 Subject: [petsc-users] List of points with dof>0 in a PetscSection Message-ID: Hi, Given a PetscSection, is there an easy way to get a list of point at which the number of dof is >0? For instance, when projecting over a FE space, I?d rather do a loop over such points than do a loop over all points in a DM, get the number of dof, and test if it is >0. Regards, Blaise -- Professor, Department of Mathematics & Statistics Hamilton Hall room 409A, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 From kkhedkar9879 at sdsu.edu Fri Jun 10 15:14:30 2022 From: kkhedkar9879 at sdsu.edu (Kaustubh Khedkar) Date: Fri, 10 Jun 2022 13:14:30 -0700 Subject: [petsc-users] Error with PetscMatlabEngineCreate Message-ID: Hi all, I am using the Petsc?s Matlab engine to run some Matlab scripts from my c++ code. I have been using Petsc?s Github commit: 9babe2dd5ff256baf1aab74d81ff9ed4c6baba0b (HEAD -> master, origin/master, origin/HEAD) Merge: e9b74a6d12 bb2d6e605a Author: Satish Balay > Date: Fri Nov 6 17:46:10 2020 +0000 I used the command: PetscMatlabEngineCreate(PETSC_COMM_SELF, "master", &(mengine)); where, the hostname is master. (Verified by typing hostname in the terminal) Everything was working fine until I updated my PETSc version to 3.17.2. 
Using this version I get error using the command: PetscMatlabEngineEvaluate(mengine, "load_parameters;?); cannot read load_parameters script. where, load_parameters is a Matlab script. When I switch the hostname to NULL from master as: PetscMatlabEngineCreate(PETSC_COMM_SELF, NULL, &(mengine)); Everything starts working fine again. All of this was executed on the same machine. Has anything changed when using the PetscMatlabEngineEvaluate command? Thank you, Kaustubh Khedkar -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 10 15:47:29 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 10 Jun 2022 16:47:29 -0400 Subject: [petsc-users] Error with PetscMatlabEngineCreate In-Reply-To: References: Message-ID: <8091FE26-A225-4461-BC7D-1E5362D65DD6@petsc.dev> Based on your report the issue is likely due to a MATLABPATH issue. The difference between using "master" and NULL is that when "master" is used, PETSc ssh's to "master" to startup the Matlab engine while with NULL it launches the Matlab engine directly from the current process in (presumably) the current directory. When ssh is used it does not have information about the current directory nor would it have any tweaks you have made to your MATLABPATH in your shell. Thus it cannot find the script. Thus when not using NULL you need to make sure that all scripts you plan to launch are findable in the MATLABPATH on the machine you are launching the Matlab engine, maybe by putting the directories in MATLABPATH on in your .bashrc or .profile file or whatever file gets sourced automatically when you ssh to master. Barry I have no explanation why the behavior would change with PETSc versions or Matlab versions but the above should resolve the problem; you may have previously just been "lucky" it could fine the script. > On Jun 10, 2022, at 4:14 PM, Kaustubh Khedkar via petsc-users wrote: > > Hi all, > > I am using the Petsc?s Matlab engine to run some Matlab scripts from my c++ code. > > I have been using Petsc?s Github commit: 9babe2dd5ff256baf1aab74d81ff9ed4c6baba0b > (HEAD -> master, origin/master, origin/HEAD) > Merge: e9b74a6d12 bb2d6e605a > Author: Satish Balay > > Date: Fri Nov 6 17:46:10 2020 +0000 > > I used the command: > > PetscMatlabEngineCreate(PETSC_COMM_SELF, "master", &(mengine)); > > where, the hostname is master. (Verified by typing hostname in the terminal) > > Everything was working fine until I updated my PETSc version to 3.17.2. > Using this version I get error using the command: > > PetscMatlabEngineEvaluate(mengine, "load_parameters;?); > > cannot read load_parameters script. where, load_parameters is a Matlab script. > > When I switch the hostname to NULL from master as: > > PetscMatlabEngineCreate(PETSC_COMM_SELF, NULL, &(mengine)); > > Everything starts working fine again. All of this was executed on the same machine. > > Has anything changed when using the PetscMatlabEngineEvaluate command? > > > Thank you, > Kaustubh Khedkar -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 10 17:14:07 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Jun 2022 18:14:07 -0400 Subject: [petsc-users] Load a Sparse Matrix from a MAT file in PETSC In-Reply-To: References: Message-ID: On Fri, Jun 10, 2022 at 3:59 PM NILTON SANTOS VALDIVIA < 100442268 at alumnos.uc3m.es> wrote: > Hello Matthew, > > Thank you very much for your answer. 
What I'm trying to do is to solve a > linear system using the A and b (matrix and vector) provided in > https://sparse.tamu.edu/FEMLAB/waveguide3D (no matter if its a .mat or a > .mtx file) and also I want to solve the system with multiple MPI processes, > I'm not an expert on it, just starting to understand the procedure. I'd > really appreciate if you could help me with this exercise. > I would: 1) Download the Matrix Market format 2) Use Mat test ex72 to read that matrix + vector and output them in PETSc binary format 3) Use KSP ex10 to read the PETSc binary format and test your solver Thanks, Matt > Best Regards, > > NILTON SANTOS > > > El vie, 10 jun 2022 a las 19:57, Matthew Knepley () > escribi?: > >> On Fri, Jun 10, 2022 at 1:15 PM NILTON SANTOS VALDIVIA < >> 100442268 at alumnos.uc3m.es> wrote: >> >>> Hello, >>> >>> I was trying to load a sparse matrix from a .MAT file (and solve the >>> linear system). But even though, I have extracted the A and b matrix and >>> vector from this suite *https://sparse.tamu.edu/FEMLAB/waveguide3D >>> * and saved the variables >>> (A, b) in a MAT file as PETSC recommends, I couldn't be able to load the >>> matrix. Is there something that I was doing wrong? or do you know a way to >>> load a matrix from the above link? >>> >>> Actually I was using src/ksp/ksp/tutorials/ex10.c.html >>> example to >>> try to load a .MAT file (containing A and b) without success. >>> >> >> Do you want to load a Matrix Market format matrix and vector? If so, this >> is what https://petsc.org/main/src/mat/tests/ex72.c.html does. >> >> Thanks, >> >> Matt >> >> >>> >>> Best Regards, >>> >>> NILTON SANTOS >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zjorti at lanl.gov Fri Jun 10 18:30:59 2022 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Fri, 10 Jun 2022 23:30:59 +0000 Subject: [petsc-users] [EXTERNAL] Re: Question about SuperLU In-Reply-To: References: , Message-ID: Hi, Thank you all for your answers. I have tried your suggestions and here is what I found. Barry you were right about the first case. But in the second case, I am not using a Schur fieldsplit but a multiplicative fieldsplit : -fieldsplit_TEBV_fieldsplit_EBV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_pc_fieldsplit_type multiplicative So, there should not be any Schur complement approximation Sp. 
When I ran a test with -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pre -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged, I got this error: 0 SNES Function norm 6.368031218939e-02 0 KSP Residual norm 6.368031218939e-02 Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 3 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Zero pivot in LU factorization: https://petsc.org/release/faq/#zeropivot [0]PETSC ERROR: Zero pivot in row 1658 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-751-g2f43bd9bc3 GIT Date: 2022-01-26 22:34:02 -0600 [0]PETSC ERROR: ./main on a macx named pn2032683.lanl.gov by zjorti Fri Jun 10 16:17:35 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 --with-cxx-dialect=C++11 --download-superlu_dist --download-parmetis --download-metis --download-ptscotch --download-cmake Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. I also outputted the BV block directly from the Jacobian matrix. Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. ________________________________ From: Barry Smith Sent: Friday, June 10, 2022 7:32 AM To: Jorti, Zakariae Cc: petsc-users at mcs.anl.gov; Tang, Xianzhu Subject: [EXTERNAL] Re: [petsc-users] Question about SuperLU It is difficult to tell exactly how the preconditioner is being formed with the information below it looks like in the first case: the original B diagonal block and V diagonal block of the matrix are being factored separately with SuperLU_DIST second case: the B block is factored with SuperLU_DIST and an explicit approximation to a Schur complement of the V block (Schur complement on eliminating the B block) is formed using "Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's %sdiagonal's inverse" (this is the printout from a KSPView() for this part of the preconditioner). My guess is you have a "Stokes"-like problem where the V block is identically 0 so, of course, the SuperLU_DIST will fail on it. But the approximation of the Schur complement onto that block is not singular so SuperLU_DIST has no trouble. If I am wrong and the V block is not identically 0 then it may be singular (or possibly, but less likely just badly order) so that SuperLU_DIST encounters a zero pivot. You can run with -ksp_view_pre to have the KSP print the KSP solver algorithm details BEFORE the linear solve (hence they would get printed despite your failed solve). That would be useful to see exactly what your preconditioner is. You can use -ksp_view_pmat (with appropriate prefix) to display the matrix that is going to be factored. Thus you can quickly verify what V is. 
If you run with -ksp_error_if_not_converged then the solver will stop exactly when the zero pivot is encountered; this would include some information from SuperLU_DIST which might include the row number etc. Notes on PETSc improvements needed. 1) The man page for KSPCheckSolve() is terribly misleading 2) It would be nice to have a view that displayed the nested fieldsplit preconditioners more clearly On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users > wrote: Hi, I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: a) SuperLU: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist Option a) yields the following error: " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT " whereas options b) seems to be working well. Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? Many thanks. Best, Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From kaustubh23593 at gmail.com Fri Jun 10 14:11:33 2022 From: kaustubh23593 at gmail.com (Kaustubh Khedkar) Date: Fri, 10 Jun 2022 12:11:33 -0700 Subject: [petsc-users] Error with PetscMatlabEngineCreate Message-ID: <3A0E58B8-A166-4024-B615-CD353F666C02@gmail.com> Hi all, I am using the Petsc?s Matlab engine to run some Matlab scripts from my c++ code. I have been using Petsc?s Github commit: 9babe2dd5ff256baf1aab74d81ff9ed4c6baba0b (HEAD -> master, origin/master, origin/HEAD) Merge: e9b74a6d12 bb2d6e605a Author: Satish Balay Date: Fri Nov 6 17:46:10 2020 +0000 I used the command: PetscMatlabEngineCreate(PETSC_COMM_SELF, "master", &(mengine)); where, the hostname is master. (Verified by typing hostname in the terminal) Everything was working fine until I updated my PETSc version to 3.17.2. Using this version I get error using the command: PetscMatlabEngineEvaluate(mengine, "load_parameters;?); cannot read load_parameters script. 
where, load_parameters is a Matlab script. When I switch the hostname to NULL from master as: PetscMatlabEngineCreate(PETSC_COMM_SELF, NULL, &(mengine)); Everything starts working fine again. All of this was executed on the same machine. Has anything changed when using the PetscMatlabEngineEvaluate command? Thank you, Kaustubh Khedkar -------------- next part -------------- An HTML attachment was scrubbed... URL: From lokie1372 at gmail.com Fri Jun 10 16:00:30 2022 From: lokie1372 at gmail.com (luciano Hammond Noratto) Date: Fri, 10 Jun 2022 23:00:30 +0200 Subject: [petsc-users] ead a matrix and vector from a file and solve a linear system in parallel Message-ID: Hello, I was trying to read a matrix and vector from here https://sparse.tamu.edu/FEMLAB/waveguide3D and then solve the linear system in parallel without success. I'm new in PETSC, I'd really appreciate it if someone could help me to solve this problem. Luciano -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Fri Jun 10 19:35:01 2022 From: xsli at lbl.gov (Xiaoye S. Li) Date: Fri, 10 Jun 2022 17:35:01 -0700 Subject: [petsc-users] [EXTERNAL] Re: Question about SuperLU In-Reply-To: References: Message-ID: Could that be due to "numerical zero pivot" (caused due to cancellation and underflow)? You can try to force diagonal to be nonzero. Looking at the options page: https://petsc.org/main/docs/manualpages/Mat/MATSOLVERSUPERLU_DIST/ You can enable this one: -mat_superlu_dist_replacetinypivot replace tiny pivots (the default is NO, not to replace tiny pivots, including zero pivots.) Sherry On Fri, Jun 10, 2022 at 4:31 PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > Thank you all for your answers. > I have tried your suggestions and here is what I found. > Barry you were right about the first case. But in the second case, I am > not using a Schur fieldsplit but a multiplicative fieldsplit : > -fieldsplit_TEBV_fieldsplit_EBV_pc_type fieldsplit > -fieldsplit_TEBV_fieldsplit_EBV_pc_fieldsplit_type multiplicative > > So, there should not be any Schur complement approximation Sp. > > When I ran a test with > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pre > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged, I > got this error: > > > 0 SNES Function norm 6.368031218939e-02 > 0 KSP Residual norm 6.368031218939e-02 > Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL > iterations 0 > Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to > CONVERGED_RTOL iterations 1 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve > converged due to CONVERGED_RTOL iterations 3 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Zero pivot in LU factorization: > https://petsc.org/release/faq/#zeropivot > [0]PETSC ERROR: Zero pivot in row 1658 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-751-g2f43bd9bc3 > GIT Date: 2022-01-26 22:34:02 -0600 > [0]PETSC ERROR: ./main on a macx named pn2032683.lanl.gov by zjorti Fri > Jun 10 16:17:35 2022 > [0]PETSC ERROR: Configure options PETSC_ARCH=macx --with-fc=0 > --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 > --with-cxx-dialect=C++11 --download-superlu_dist --download-parmetis > --download-metis --download-ptscotch --download-cmake > > > Then I tried this flag > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat > binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. > For Matlab, this is a full rank matrix, and the LU factorization there was > carried out without any issues. > > I also outputted the BV block directly from the Jacobian matrix. > Once again, according to Matlab, it is a full rank matrix and it computes > the LU factorization without any problem. > > > ------------------------------ > *From:* Barry Smith > *Sent:* Friday, June 10, 2022 7:32 AM > *To:* Jorti, Zakariae > *Cc:* petsc-users at mcs.anl.gov; Tang, Xianzhu > *Subject:* [EXTERNAL] Re: [petsc-users] Question about SuperLU > > > It is difficult to tell exactly how the preconditioner is being formed > with the information below it looks like in the > > first case: the original B diagonal block and V diagonal block of the > matrix are being factored separately with SuperLU_DIST > > second case: the B block is factored with SuperLU_DIST and an explicit > approximation to a Schur complement of the V block (Schur complement on > eliminating the B block) is formed using "Preconditioner for the Schur > complement formed from Sp, an assembled approximation to S, which uses > A00's %sdiagonal's inverse" (this is the printout from a KSPView() for this > part of the preconditioner). > > My guess is you have a "Stokes"-like problem where the V block is > identically 0 so, of course, the SuperLU_DIST will fail on it. But the > approximation of the Schur complement onto that block is not singular so > SuperLU_DIST has no trouble. If I am wrong and the V block is not > identically 0 then it may be singular (or possibly, but less likely just > badly order) so that SuperLU_DIST encounters a zero pivot. > > You can run with -ksp_view_pre to have the KSP print the KSP solver > algorithm details BEFORE the linear solve (hence they would get printed > despite your failed solve). That would be useful to see exactly what your > preconditioner is. > > You can use -ksp_view_pmat (with appropriate prefix) to display the > matrix that is going to be factored. Thus you can quickly verify what V is. > > If you run with -ksp_error_if_not_converged then the solver will stop > exactly when the zero pivot is encountered; this would include some > information from SuperLU_DIST which might include the row number etc. > > Notes on PETSc improvements needed. > > 1) The man page for KSPCheckSolve() is terribly misleading > > 2) It would be nice to have a view that displayed the nested fieldsplit > preconditioners more clearly > > > > > > > On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > > I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and > for the preconditioning part, I am using a FieldSplit preconditioner. 
At > the last fieldsplit/level, we are left with a {B,V} block that tried to > precondition in 2 different ways: > a) SuperLU: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type > superlu_dist > b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V > and B blocks: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition > selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type > preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type > lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type > superlu_dist > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type > superlu_dist > > Option a) yields the following error: > " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL > iterations 0 > Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to > CONVERGED_RTOL iterations 1 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve > converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did > not converge due to DIVERGED_PC_FAILED iterations 0 > PC failed due to FACTOR_NUMERIC_ZEROPIVOT " > whereas options b) seems to be working well. > Is it possible that the SuperLU on the {V,B} block uses a reordering that > introduces a zero pivot or could there be another explanation for this > error? > > Many thanks. > Best, > > Zakariae > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Fri Jun 10 19:44:05 2022 From: xsli at lbl.gov (Xiaoye S. Li) Date: Fri, 10 Jun 2022 17:44:05 -0700 Subject: [petsc-users] Question about SuperLU In-Reply-To: <451BCBEE-FCC1-44D2-946F-4AF80403E6A9@msu.edu> References: <451BCBEE-FCC1-44D2-946F-4AF80403E6A9@msu.edu> Message-ID: On Fri, Jun 10, 2022 at 7:43 AM Tang, Qi wrote: > ?We use superlu_dist. > > We have a 2 x 2 block where directly calling suplerlu_dist fails, but a pc > based on a fieldsplit Schur complement + superlu_dist on the assembled > Schur complement matrix converges. (All the converge criteria are default > at this level) > > I am having a hard time to understand what is going on. The B,V block is > of size 240K, so it is also hard to analyze. And the mat is not something > we explicitly formed. It is formed by finite difference coloring Jacobian + > a few levels of Schur complement. > > / A 0 \ > \ 0 I / > > Matt, I do not see this can explain why the second pc with superlu on S = > A would succeed, if A is not full rank. > > I believe I found somewhere it says petsc?s pclu (or maybe superlu_dist) > did reordering and it may introduce 0 pivoting. We are asking because it > seems there is something we do not understand from pclu/superlu level. > If the matrix is non-singular, and you use the RowPerm default option: LargeDiag_MC64, then you won't have zero pivot (in a structural sense), unless numerical cancellation causes the diagonal element underflow, then flush to zero. 
You can try to set ReplaceTinyPivot: -mat_superlu_dist_replacetinypivot replace tiny pivots See my reply in another email. Sherry > Anyway, is there a way to output the mat before it fails? We have been > struggling to do that. We have TSSolve->SNES->FDColoringJacobian->A few > levels of fieldsplit->failed Subblock matrix, which we want to analyze. > (Sometimes it even happens in the second Newton iteration as the first one > works okay.) > > Qi > > > > On Jun 10, 2022, at 8:11 AM, Matthew Knepley wrote: > > ? > On Thu, Jun 9, 2022 at 5:20 PM Jorti, Zakariae via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi, >> >> I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and >> for the preconditioning part, I am using a FieldSplit preconditioner. At >> the last fieldsplit/level, we are left with a {B,V} block that tried to >> precondition in 2 different ways: >> a) SuperLU: >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type >> superlu_dist >> b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V >> and B blocks: >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition >> selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type >> preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type >> lu >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type >> superlu_dist >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type >> superlu_dist >> >> Option a) yields the following error: >> " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL >> iterations 0 >> Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to >> CONVERGED_RTOL iterations 1 >> Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve >> converged due to CONVERGED_RTOL iterations 5 >> Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did >> not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to FACTOR_NUMERIC_ZEROPIVOT " >> whereas options b) seems to be working well. >> Is it possible that the SuperLU on the {V,B} block uses a reordering that >> introduces a zero pivot or could there be another explanation for this >> error? >> > > I can at least come up with a case where this is true. Suppose you have > > / A 0 \ > \ 0 I / > > where A is rank deficient, but has a positive diagonal. SuperLU will fail > since it is actually singular. However, your Schur complement might work > since you use > 'selfp' for the Schur preconditioner, and it just extracts the diagonal. > > Thanks, > > Matt > > >> Many thanks. >> Best, >> >> Zakariae >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... 
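For readers following this digest: in the nested solve above, these SuperLU_DIST options would presumably need the prefix of the BV block's factorization, i.e. the same prefix already used for -pc_factor_mat_solver_type. The exact spellings below are an assumption (the registered names can be confirmed by running with -help and searching for superlu_dist):

    -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_mat_superlu_dist_replacetinypivot
    -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_mat_superlu_dist_rowperm LargeDiag_MC64

The first option replaces tiny (including zero) pivots during the numerical factorization; the second merely spells out the default row permutation that Sherry refers to above.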
URL: From mail2amneet at gmail.com Fri Jun 10 22:51:55 2022 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Fri, 10 Jun 2022 20:51:55 -0700 Subject: [petsc-users] Error with PetscMatlabEngineCreate In-Reply-To: <8091FE26-A225-4461-BC7D-1E5362D65DD6@petsc.dev> References: <8091FE26-A225-4461-BC7D-1E5362D65DD6@petsc.dev> Message-ID: Thanks Barry. Adding absolute paths made it work with "master" hostname. We have added both options in our code just in case the NULL hostname does not work on a different machine. https://github.com/IBAMR/cfd-mpc-wecs/blob/main/main.cpp#L346-L361 On Fri, Jun 10, 2022 at 1:48 PM Barry Smith wrote: > > Based on your report the issue is likely due to a MATLABPATH issue. The > difference between using "master" and NULL is that when "master" is used, > PETSc ssh's to "master" to startup the Matlab engine while with NULL it > launches the Matlab engine directly from the current process in > (presumably) the current directory. > > When ssh is used it does not have information about the current > directory nor would it have any tweaks you have made to your MATLABPATH in > your shell. Thus it cannot find the script. > > Thus when not using NULL you need to make sure that all scripts you > plan to launch are findable in the MATLABPATH on the machine you are > launching the Matlab engine, maybe by putting the directories in MATLABPATH > on in your .bashrc or .profile file or whatever file gets sourced > automatically when you ssh to master. > > Barry > > I have no explanation why the behavior would change with PETSc versions > or Matlab versions but the above should resolve the problem; you may have > previously just been "lucky" it could fine the script. > > > > On Jun 10, 2022, at 4:14 PM, Kaustubh Khedkar via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi all, > > I am using the Petsc?s Matlab engine to run some Matlab scripts from my > c++ code. > > I have been using Petsc?s Github commit: > 9babe2dd5ff256baf1aab74d81ff9ed4c6baba0b > (HEAD -> master, origin/master, origin/HEAD) > Merge: e9b74a6d12 bb2d6e605a > Author: Satish Balay > Date: Fri Nov 6 17:46:10 2020 +0000 > > I used the command: > > PetscMatlabEngineCreate(PETSC_COMM_SELF, "master", &(mengine)); > > where, the hostname is master. (Verified by typing hostname in the > terminal) > > Everything was working fine until I updated my PETSc version to 3.17.2. > Using this version I get error using the command: > > PetscMatlabEngineEvaluate(mengine, "load_parameters;?); > > cannot read load_parameters script. where, load_parameters is a Matlab > script. > > When I switch the hostname to NULL from master as: > > PetscMatlabEngineCreate(PETSC_COMM_SELF, NULL, &(mengine)); > > Everything starts working fine again. All of this was executed on the same > machine. > > Has anything changed when using the PetscMatlabEngineEvaluate command? > > > Thank you, > Kaustubh Khedkar > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... 
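For anyone who finds this thread later, a minimal sketch of the work-around discussed above (the directory /absolute/path/to/scripts is a placeholder, error checking is omitted, and this is not the code from the linked repository):

    PetscMatlabEngine mengine;

    /* Launch MATLAB from the current process (NULL host) so it inherits the
       current working directory rather than a fresh ssh login environment. */
    PetscMatlabEngineCreate(PETSC_COMM_SELF, NULL, &mengine);

    /* Make the script location explicit instead of relying on MATLABPATH. */
    PetscMatlabEngineEvaluate(mengine, "addpath('/absolute/path/to/scripts');");
    PetscMatlabEngineEvaluate(mengine, "load_parameters;");

    PetscMatlabEngineDestroy(&mengine);

If the engine does have to be launched on a remote host such as "master", the equivalent fix is to make the script directory visible in that login environment, for example by adding it to MATLABPATH in the shell startup file that gets sourced on ssh, as Barry describes above.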
URL: From mfadams at lbl.gov Sat Jun 11 09:25:36 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 11 Jun 2022 10:25:36 -0400 Subject: [petsc-users] ead a matrix and vector from a file and solve a linear system in parallel In-Reply-To: References: Message-ID: Others would know more than me but there is a non-complex example in src/mat/tests/ex72.c Mark On Fri, Jun 10, 2022 at 7:52 PM luciano Hammond Noratto wrote: > Hello, > > I was trying to read a matrix and vector from here > https://sparse.tamu.edu/FEMLAB/waveguide3D and then solve the linear > system in parallel without success. I'm new in PETSC, I'd really appreciate > it if someone could help me to solve this problem. > > Luciano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Jun 11 09:45:56 2022 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 11 Jun 2022 10:45:56 -0400 Subject: [petsc-users] [EXTERNAL] Question about SuperLU In-Reply-To: References: Message-ID: <3BD31D76-AC27-47E5-85E2-9759C3ADF98C@petsc.dev> > On Jun 10, 2022, at 7:30 PM, Jorti, Zakariae wrote: > > Hi, > > Thank you all for your answers. > I have tried your suggestions and here is what I found. > Barry you were right about the first case. But in the second case, I am not using a Schur fieldsplit but a multiplicative fieldsplit : -fieldsplit_TEBV_fieldsplit_EBV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_pc_fieldsplit_type multiplicative The previous email indicated > b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: > -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur - which means there is a Schur complement PC inside the multiplicative so my explanation that the Schur complement "saves" the problem by passing into SuperLU_DIST a non-singular matrix that is some approximation to the Schur complement could be true. > Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. > For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. > I also outputted the BV block directly from the Jacobian matrix. > Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. The BV matrix you saved into Matlab is a "block" matrix where the first block is B and the second block V (presumably both the same size). Can you, in Matlab, extract the two blocks separated and examine them (via say spy) and also have Matlab factor each of them separately? In your failed fieldsplit case SuperLU_DIST is factoring each of these matrices separately which could produce a zero pivot that would not occur when the larger matrix (of both blocks) is factored together. Let's see what happens with Matlab's solver. It looks like you are running on one rank? If the above process is not informative this is what you do next. Use PetscBinaryWrite() from Matlab to save each of the two blocks (one for B and one for V) to two files. Then use a simple standalone PETSc code, say src/ksp/ksp/tutorials/ex10.c to read each of the files and use SuperLU_DIST directly on each of the two linear systems. This will, at least to my understanding, result in the exact same SuperLU_DIST solves that you get with the failed use of PCFIELDSPLIT. 
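As a rough illustration of the kind of standalone driver described here (this is only a sketch against a recent PETSc using the PetscCall() macro, not the actual ex10.c source; the file name Bblock.bin and the random right-hand side are placeholders):

    #include <petscksp.h>

    int main(int argc, char **argv)
    {
      Mat         A;
      Vec         b, x;
      KSP         ksp;
      PC          pc;
      PetscViewer viewer;

      PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

      /* Load the block that was saved with PetscBinaryWrite() from Matlab. */
      PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Bblock.bin", FILE_MODE_READ, &viewer));
      PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
      PetscCall(MatLoad(A, viewer));
      PetscCall(PetscViewerDestroy(&viewer));

      /* A random right-hand side is enough to exercise the factorization. */
      PetscCall(MatCreateVecs(A, &x, &b));
      PetscCall(VecSetRandom(b, NULL));

      /* preonly + LU + SuperLU_DIST mirrors the solver used on the block inside the fieldsplit. */
      PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
      PetscCall(KSPSetOperators(ksp, A, A));
      PetscCall(KSPSetType(ksp, KSPPREONLY));
      PetscCall(KSPGetPC(ksp, &pc));
      PetscCall(PCSetType(pc, PCLU));
      PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERSUPERLU_DIST));
      PetscCall(KSPSetFromOptions(ksp));
      PetscCall(KSPSolve(ksp, b, x));

      PetscCall(KSPDestroy(&ksp));
      PetscCall(VecDestroy(&b));
      PetscCall(VecDestroy(&x));
      PetscCall(MatDestroy(&A));
      PetscCall(PetscFinalize());
      return 0;
    }

The same skeleton, with the right-hand side read via VecLoad() instead of being set randomly, also addresses the earlier question in this digest about reading a matrix and vector from a file and solving in parallel, once the downloaded matrix has been converted to PETSc's binary format.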
If they succeed or fail will be very informative. Barry > > So, there should not be any Schur complement approximation Sp. > > When I ran a test with -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pre -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged, I got this error: > > 0 SNES Function norm 6.368031218939e-02 > 0 KSP Residual norm 6.368031218939e-02 > Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 > Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 > Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 3 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Zero pivot in LU factorization: https://petsc.org/release/faq/#zeropivot > [0]PETSC ERROR: Zero pivot in row 1658 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-751-g2f43bd9bc3 GIT Date: 2022-01-26 22:34:02 -0600 > [0]PETSC ERROR: ./main on a macx named pn2032683.lanl.gov by zjorti Fri Jun 10 16:17:35 2022 > [0]PETSC ERROR: Configure options PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 --with-cxx-dialect=C++11 --download-superlu_dist --download-parmetis --download-metis --download-ptscotch --download-cmake > > > Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. > For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. > I also outputted the BV block directly from the Jacobian matrix. > Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. > > > From: Barry Smith > > Sent: Friday, June 10, 2022 7:32 AM > To: Jorti, Zakariae > Cc: petsc-users at mcs.anl.gov ; Tang, Xianzhu > Subject: [EXTERNAL] Re: [petsc-users] Question about SuperLU > > > It is difficult to tell exactly how the preconditioner is being formed with the information below it looks like in the > > first case: the original B diagonal block and V diagonal block of the matrix are being factored separately with SuperLU_DIST > > second case: the B block is factored with SuperLU_DIST and an explicit approximation to a Schur complement of the V block (Schur complement on eliminating the B block) is formed using "Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's %sdiagonal's inverse" (this is the printout from a KSPView() for this part of the preconditioner). > > My guess is you have a "Stokes"-like problem where the V block is identically 0 so, of course, the SuperLU_DIST will fail on it. But the approximation of the Schur complement onto that block is not singular so SuperLU_DIST has no trouble. If I am wrong and the V block is not identically 0 then it may be singular (or possibly, but less likely just badly order) so that SuperLU_DIST encounters a zero pivot. > > You can run with -ksp_view_pre to have the KSP print the KSP solver algorithm details BEFORE the linear solve (hence they would get printed despite your failed solve). That would be useful to see exactly what your preconditioner is. > > You can use -ksp_view_pmat (with appropriate prefix) to display the matrix that is going to be factored. Thus you can quickly verify what V is. 
> > If you run with -ksp_error_if_not_converged then the solver will stop exactly when the zero pivot is encountered; this would include some information from SuperLU_DIST which might include the row number etc. > > Notes on PETSc improvements needed. > > 1) The man page for KSPCheckSolve() is terribly misleading > > 2) It would be nice to have a view that displayed the nested fieldsplit preconditioners more clearly > > > > > > >> On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users > wrote: >> >> Hi, >> >> I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: >> a) SuperLU: >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist >> b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: >> -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist >> >> Option a) yields the following error: >> " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 >> Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 >> Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 >> Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 >> PC failed due to FACTOR_NUMERIC_ZEROPIVOT " >> whereas options b) seems to be working well. >> Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? >> >> Many thanks. >> Best, >> >> Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Sat Jun 11 10:15:07 2022 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Sat, 11 Jun 2022 15:15:07 +0000 Subject: [petsc-users] [EXTERNAL] Question about SuperLU In-Reply-To: <3BD31D76-AC27-47E5-85E2-9759C3ADF98C@petsc.dev> References: <3BD31D76-AC27-47E5-85E2-9759C3ADF98C@petsc.dev> Message-ID: If each block is sequential, try replace SuperLU_DIST with SuperLU, which would be more robust. You may also try MUMPS LU. Hong ________________________________ From: petsc-users on behalf of Barry Smith Sent: Saturday, June 11, 2022 9:45 AM To: Jorti, Zakariae Cc: petsc-users at mcs.anl.gov ; Tang, Xianzhu Subject: Re: [petsc-users] [EXTERNAL] Question about SuperLU On Jun 10, 2022, at 7:30 PM, Jorti, Zakariae > wrote: Hi, Thank you all for your answers. I have tried your suggestions and here is what I found. Barry you were right about the first case. 
But in the second case, I am not using a Schur fieldsplit but a multiplicative fieldsplit : -fieldsplit_TEBV_fieldsplit_EBV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_pc_fieldsplit_type multiplicative The previous email indicated b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur - which means there is a Schur complement PC inside the multiplicative so my explanation that the Schur complement "saves" the problem by passing into SuperLU_DIST a non-singular matrix that is some approximation to the Schur complement could be true. Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. I also outputted the BV block directly from the Jacobian matrix. Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. The BV matrix you saved into Matlab is a "block" matrix where the first block is B and the second block V (presumably both the same size). Can you, in Matlab, extract the two blocks separated and examine them (via say spy) and also have Matlab factor each of them separately? In your failed fieldsplit case SuperLU_DIST is factoring each of these matrices separately which could produce a zero pivot that would not occur when the larger matrix (of both blocks) is factored together. Let's see what happens with Matlab's solver. It looks like you are running on one rank? If the above process is not informative this is what you do next. Use PetscBinaryWrite() from Matlab to save each of the two blocks (one for B and one for V) to two files. Then use a simple standalone PETSc code, say src/ksp/ksp/tutorials/ex10.c to read each of the files and use SuperLU_DIST directly on each of the two linear systems. This will, at least to my understanding, result in the exact same SuperLU_DIST solves that you get with the failed use of PCFIELDSPLIT. If they succeed or fail will be very informative. Barry So, there should not be any Schur complement approximation Sp. When I ran a test with -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pre -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged, I got this error: 0 SNES Function norm 6.368031218939e-02 0 KSP Residual norm 6.368031218939e-02 Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 3 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Zero pivot in LU factorization: https://petsc.org/release/faq/#zeropivot [0]PETSC ERROR: Zero pivot in row 1658 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-751-g2f43bd9bc3 GIT Date: 2022-01-26 22:34:02 -0600 [0]PETSC ERROR: ./main on a macx named pn2032683.lanl.gov by zjorti Fri Jun 10 16:17:35 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 --with-cxx-dialect=C++11 --download-superlu_dist --download-parmetis --download-metis --download-ptscotch --download-cmake Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. I also outputted the BV block directly from the Jacobian matrix. Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. ________________________________ From: Barry Smith > Sent: Friday, June 10, 2022 7:32 AM To: Jorti, Zakariae Cc: petsc-users at mcs.anl.gov; Tang, Xianzhu Subject: [EXTERNAL] Re: [petsc-users] Question about SuperLU It is difficult to tell exactly how the preconditioner is being formed with the information below it looks like in the first case: the original B diagonal block and V diagonal block of the matrix are being factored separately with SuperLU_DIST second case: the B block is factored with SuperLU_DIST and an explicit approximation to a Schur complement of the V block (Schur complement on eliminating the B block) is formed using "Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's %sdiagonal's inverse" (this is the printout from a KSPView() for this part of the preconditioner). My guess is you have a "Stokes"-like problem where the V block is identically 0 so, of course, the SuperLU_DIST will fail on it. But the approximation of the Schur complement onto that block is not singular so SuperLU_DIST has no trouble. If I am wrong and the V block is not identically 0 then it may be singular (or possibly, but less likely just badly order) so that SuperLU_DIST encounters a zero pivot. You can run with -ksp_view_pre to have the KSP print the KSP solver algorithm details BEFORE the linear solve (hence they would get printed despite your failed solve). That would be useful to see exactly what your preconditioner is. You can use -ksp_view_pmat (with appropriate prefix) to display the matrix that is going to be factored. Thus you can quickly verify what V is. If you run with -ksp_error_if_not_converged then the solver will stop exactly when the zero pivot is encountered; this would include some information from SuperLU_DIST which might include the row number etc. Notes on PETSc improvements needed. 1) The man page for KSPCheckSolve() is terribly misleading 2) It would be nice to have a view that displayed the nested fieldsplit preconditioners more clearly On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users > wrote: Hi, I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. 
At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: a) SuperLU: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist Option a) yields the following error: " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT " whereas options b) seems to be working well. Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? Many thanks. Best, Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From tangqi at msu.edu Sat Jun 11 12:39:25 2022 From: tangqi at msu.edu (Tang, Qi) Date: Sat, 11 Jun 2022 17:39:25 +0000 Subject: [petsc-users] [EXTERNAL] Question about SuperLU In-Reply-To: <3BD31D76-AC27-47E5-85E2-9759C3ADF98C@petsc.dev> References: <3BD31D76-AC27-47E5-85E2-9759C3ADF98C@petsc.dev> Message-ID: <1B71D61A-AEED-4420-B93A-AAF786E92C5F@msu.edu> Thanks for explaining. Let me summarize what we found so far. Barry was correct on the fieldsplit comment. * Applying superlu_dist to the entire BV block failed (?together factorization?) * Applying superlu_dist to the selfp version of schur complement for B and the diagonal sub-block for V succeeded (?separate factorization?). The solution looks fine. It is a regularized saddle point problem, so the diagonal blocks are not singular. There is no implementation to switch between two options on our end, so that should exclude any potential bugs in the solver level. The original size of BV block is 240K, so we have to use superlu_dist. We also check its precondition number and it looks fine. Now we downsize the problem so that we can analyze in matlab. The BV size becomes roughly 10K. We checked various things using matlab and it seems the matrix looks fine from matlab. But superlu_dist still runs into zero pivoting error (all default options on superlu_dist). Yes, we should try superlu or mumps, which is a good suggestion. We will also try change the superlu_dist flag as Sherry suggested. Now we know a few detections to do on petsc/matlab. Thanks for all the good suggestions. 
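For reference, acting on Hong's suggestion is a one-option change for that block; the spellings below reuse the BV prefix already shown in this thread and assume PETSc was configured with the corresponding package (for MUMPS, e.g. --download-mumps --download-scalapack):

    -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type mumps
    -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu

The sequential SuperLU entry only applies when that block lives on a single rank, which is the situation Hong describes; the same substitution can be made on the inner ..._fieldsplit_B_ and ..._fieldsplit_V_ prefixes used in option b).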
Qi On Jun 11, 2022, at 8:46 AM, Barry Smith wrote: ? On Jun 10, 2022, at 7:30 PM, Jorti, Zakariae > wrote: Hi, Thank you all for your answers., I have tried your suggestions and here is what I found. Barry you were right about the first case. But in the second case, I am not using a Schur fieldsplit but a multiplicative fieldsplit : -fieldsplit_TEBV_fieldsplit_EBV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_pc_fieldsplit_type multiplicative The previous email indicated b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur - which means there is a Schur complement PC inside the multiplicative so my explanation that the Schur complement "saves" the problem by passing into SuperLU_DIST a non-singular matrix that is some approximation to the Schur complement could be true. Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. I also outputted the BV block directly from the Jacobian matrix. Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. The BV matrix you saved into Matlab is a "block" matrix where the first block is B and the second block V (presumably both the same size). Can you, in Matlab, extract the two blocks separated and examine them (via say spy) and also have Matlab factor each of them separately? In your failed fieldsplit case SuperLU_DIST is factoring each of these matrices separately which could produce a zero pivot that would not occur when the larger matrix (of both blocks) is factored together. Let's see what happens with Matlab's solver. It looks like you are running on one rank? If the above process is not informative this is what you do next. Use PetscBinaryWrite() from Matlab to save each of the two blocks (one for B and one for V) to two files. Then use a simple standalone PETSc code, say src/ksp/ksp/tutorials/ex10.c to read each of the files and use SuperLU_DIST directly on each of the two linear systems. This will, at least to my understanding, result in the exact same SuperLU_DIST solves that you get with the failed use of PCFIELDSPLIT. If they succeed or fail will be very informative. Barry So, there should not be any Schur complement approximation Sp. When I ran a test with -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pre -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_error_if_not_converged, I got this error: 0 SNES Function norm 6.368031218939e-02 0 KSP Residual norm 6.368031218939e-02 Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 3 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Zero pivot in LU factorization: https://petsc.org/release/faq/#zeropivot [0]PETSC ERROR: Zero pivot in row 1658 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-751-g2f43bd9bc3 GIT Date: 2022-01-26 22:34:02 -0600 [0]PETSC ERROR: ./main on a macx named pn2032683.lanl.gov by zjorti Fri Jun 10 16:17:35 2022 [0]PETSC ERROR: Configure options PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 --with-cxx-dialect=C++11 --download-superlu_dist --download-parmetis --download-metis --download-ptscotch --download-cmake Then I tried this flag -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_view_pmat binary:BVmat:binary_matlab and checked the resulting matrix in Matlab. For Matlab, this is a full rank matrix, and the LU factorization there was carried out without any issues. I also outputted the BV block directly from the Jacobian matrix. Once again, according to Matlab, it is a full rank matrix and it computes the LU factorization without any problem. ________________________________ From: Barry Smith > Sent: Friday, June 10, 2022 7:32 AM To: Jorti, Zakariae Cc: petsc-users at mcs.anl.gov; Tang, Xianzhu Subject: [EXTERNAL] Re: [petsc-users] Question about SuperLU It is difficult to tell exactly how the preconditioner is being formed with the information below it looks like in the first case: the original B diagonal block and V diagonal block of the matrix are being factored separately with SuperLU_DIST second case: the B block is factored with SuperLU_DIST and an explicit approximation to a Schur complement of the V block (Schur complement on eliminating the B block) is formed using "Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's %sdiagonal's inverse" (this is the printout from a KSPView() for this part of the preconditioner). My guess is you have a "Stokes"-like problem where the V block is identically 0 so, of course, the SuperLU_DIST will fail on it. But the approximation of the Schur complement onto that block is not singular so SuperLU_DIST has no trouble. If I am wrong and the V block is not identically 0 then it may be singular (or possibly, but less likely just badly order) so that SuperLU_DIST encounters a zero pivot. You can run with -ksp_view_pre to have the KSP print the KSP solver algorithm details BEFORE the linear solve (hence they would get printed despite your failed solve). That would be useful to see exactly what your preconditioner is. You can use -ksp_view_pmat (with appropriate prefix) to display the matrix that is going to be factored. Thus you can quickly verify what V is. If you run with -ksp_error_if_not_converged then the solver will stop exactly when the zero pivot is encountered; this would include some information from SuperLU_DIST which might include the row number etc. Notes on PETSc improvements needed. 1) The man page for KSPCheckSolve() is terribly misleading 2) It would be nice to have a view that displayed the nested fieldsplit preconditioners more clearly On Jun 9, 2022, at 5:19 PM, Jorti, Zakariae via petsc-users > wrote: Hi, I am solving non-linear problem that has 5 unknowns {ni, T, E, B, V}, and for the preconditioning part, I am using a FieldSplit preconditioner. 
At the last fieldsplit/level, we are left with a {B,V} block that tried to precondition in 2 different ways: a) SuperLU: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_factor_mat_solver_type superlu_dist b) a Schur-based fieldsplit preconditioner that uses SuperLU for both V and B blocks: -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ksp_type gmres -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_type fieldsplit -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_type schur -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_pc_fieldsplit_schur_precondition selfp -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_B_pc_factor_mat_solver_type superlu_dist -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_ksp_type preonly -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_type lu -fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_fieldsplit_V_pc_factor_mat_solver_type superlu_dist Option a) yields the following error: " Linear fieldsplit_ni_ solve converged due to CONVERGED_ATOL iterations 0 Linear fieldsplit_TEBV_fieldsplit_tau_ solve converged due to CONVERGED_RTOL iterations 1 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_EP_ solve converged due to CONVERGED_RTOL iterations 5 Linear fieldsplit_TEBV_fieldsplit_EBV_fieldsplit_BV_ solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT " whereas options b) seems to be working well. Is it possible that the SuperLU on the {V,B} block uses a reordering that introduces a zero pivot or could there be another explanation for this error? Many thanks. Best, Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuelestes91 at gmail.com Sat Jun 11 19:32:20 2022 From: samuelestes91 at gmail.com (Samuel Estes) Date: Sat, 11 Jun 2022 19:32:20 -0500 Subject: [petsc-users] Mat preallocation for adaptive grid Message-ID: Hello, My question concerns preallocation for Mats in adaptive FEM problems. When the grid refines, I destroy the old matrix and create a new one of the appropriate (larger size). When the grid ?un-refines? I just use the same (extra large) matrix and pad the extra unused diagonal entries with 1?s. The problem comes in with the preallocation. I use the MatPreallocator, MatPreallocatorPreallocate() paradigm which requires a specific sparsity pattern. When the grid un-refines, although the total number of nonzeros allocated is (most likely) more than sufficient, the particular sparsity pattern changes which leads to mallocs in the MatSetValues routines and obviously I would like to avoid this. One obvious solution is just to destroy and recreate the matrix any time the grid changes, even if it gets smaller. By just using a new matrix every time, I would avoid this problem although at the cost of having to rebuild the matrix more often than necessary. This is the simplest solution from a programming perspective and probably the one I will go with. I'm just curious if there's an alternative that you would recommend? Basically what I would like to do is to just change the sparsity pattern that is created in the MatPreallocatorPreallocate() routine. 
I'm not sure how it works under the hood, but in principle, it should be possible to keep the memory allocated for the Mat values and just assign them new column numbers and potentially add new nonzeros as well. Is there a convenient way of doing this? One thought I had was to just fill in the MatPreallocator object with the new sparsity pattern of the coarser mesh and then call the MatPreallocatorPreallocate() routine again with the new MatPreallocator matrix. I'm just not sure how exactly that would work since it would have already been called for the FEM matrix for the previous, finer grid. Finally, does this really matter? I imagine the bottleneck (assuming good preallocation) is in the solver so maybe it doesn't make much difference whether or not I reuse the old matrix. In that case, going with option 1 and simply destroying and recreating the matrix would be the way to go just to save myself some time. I hope that my question is clear. If not, please let me know and I will clarify. I am very curious if there's a convenient solution for the second option I mentioned to recycle the allocated memory and redo the sparsity pattern. Thanks! Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jun 11 19:38:35 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 11 Jun 2022 20:38:35 -0400 Subject: [petsc-users] Mat preallocation for adaptive grid In-Reply-To: References: Message-ID: On Sat, Jun 11, 2022 at 8:32 PM Samuel Estes wrote: > Hello, > > My question concerns preallocation for Mats in adaptive FEM problems. When > the grid refines, I destroy the old matrix and create a new one of the > appropriate (larger size). When the grid ?un-refines? I just use the same > (extra large) matrix and pad the extra unused diagonal entries with 1?s. > The problem comes in with the preallocation. I use the MatPreallocator, > MatPreallocatorPreallocate() paradigm which requires a specific sparsity > pattern. When the grid un-refines, although the total number of nonzeros > allocated is (most likely) more than sufficient, the particular sparsity > pattern changes which leads to mallocs in the MatSetValues routines and > obviously I would like to avoid this. > > One obvious solution is just to destroy and recreate the matrix any time > the grid changes, even if it gets smaller. By just using a new matrix every > time, I would avoid this problem although at the cost of having to rebuild > the matrix more often than necessary. This is the simplest solution from a > programming perspective and probably the one I will go with. > > I'm just curious if there's an alternative that you would recommend? > Basically what I would like to do is to just change the sparsity pattern > that is created in the MatPreallocatorPreallocate() routine. I'm not sure > how it works under the hood, but in principle, it should be possible to > keep the memory allocated for the Mat values and just assign them new > column numbers and potentially add new nonzeros as well. Is there a > convenient way of doing this? One thought I had was to just fill in the > MatPreallocator object with the new sparsity pattern of the coarser mesh > and then call the MatPreallocatorPreallocate() routine again with the new > MatPreallocator matrix. I'm just not sure how exactly that would work since > it would have already been called for the FEM matrix for the previous, > finer grid. > > Finally, does this really matter? 
I imagine the bottleneck (assuming good > preallocation) is in the solver so maybe it doesn't make much difference > whether or not I reuse the old matrix. In that case, going with option 1 > and simply destroying and recreating the matrix would be the way to go just > to save myself some time. > > I hope that my question is clear. If not, please let me know and I will > clarify. I am very curious if there's a convenient solution for the second > option I mentioned to recycle the allocated memory and redo the sparsity > pattern. > I have not run any tests of this kind of thing, so I cannot say definitively. I can say that I consider the reuse of memory a problem to be solved at allocation time. You would hope that a good malloc system would give you back the same memory you just freed when getting rid of the prior matrix, so you would get the speedup you want using your approach. Second, I think the allocation cost is likely to pale in comparison to the cost of writing the matrix itself (passing all those indices and values through the memory bus), and so reuse of the memory is not that important (I think). Thanks, Matt > Thanks! > > Sam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuelestes91 at gmail.com Sat Jun 11 19:43:06 2022 From: samuelestes91 at gmail.com (Samuel Estes) Date: Sat, 11 Jun 2022 19:43:06 -0500 Subject: [petsc-users] Mat preallocation for adaptive grid In-Reply-To: References: Message-ID: I'm sorry, would you mind clarifying? I think my email was so long and rambling that it's tough for me to understand which part was being answered. On Sat, Jun 11, 2022 at 7:38 PM Matthew Knepley wrote: > On Sat, Jun 11, 2022 at 8:32 PM Samuel Estes > wrote: > >> Hello, >> >> My question concerns preallocation for Mats in adaptive FEM problems. >> When the grid refines, I destroy the old matrix and create a new one of the >> appropriate (larger size). When the grid ?un-refines? I just use the same >> (extra large) matrix and pad the extra unused diagonal entries with 1?s. >> The problem comes in with the preallocation. I use the MatPreallocator, >> MatPreallocatorPreallocate() paradigm which requires a specific sparsity >> pattern. When the grid un-refines, although the total number of nonzeros >> allocated is (most likely) more than sufficient, the particular sparsity >> pattern changes which leads to mallocs in the MatSetValues routines and >> obviously I would like to avoid this. >> >> One obvious solution is just to destroy and recreate the matrix any time >> the grid changes, even if it gets smaller. By just using a new matrix every >> time, I would avoid this problem although at the cost of having to rebuild >> the matrix more often than necessary. This is the simplest solution from a >> programming perspective and probably the one I will go with. >> >> I'm just curious if there's an alternative that you would recommend? >> Basically what I would like to do is to just change the sparsity pattern >> that is created in the MatPreallocatorPreallocate() routine. I'm not sure >> how it works under the hood, but in principle, it should be possible to >> keep the memory allocated for the Mat values and just assign them new >> column numbers and potentially add new nonzeros as well. Is there a >> convenient way of doing this? 
One thought I had was to just fill in the >> MatPreallocator object with the new sparsity pattern of the coarser mesh >> and then call the MatPreallocatorPreallocate() routine again with the new >> MatPreallocator matrix. I'm just not sure how exactly that would work since >> it would have already been called for the FEM matrix for the previous, >> finer grid. >> >> Finally, does this really matter? I imagine the bottleneck (assuming good >> preallocation) is in the solver so maybe it doesn't make much difference >> whether or not I reuse the old matrix. In that case, going with option 1 >> and simply destroying and recreating the matrix would be the way to go just >> to save myself some time. >> >> I hope that my question is clear. If not, please let me know and I will >> clarify. I am very curious if there's a convenient solution for the second >> option I mentioned to recycle the allocated memory and redo the sparsity >> pattern. >> > > I have not run any tests of this kind of thing, so I cannot say > definitively. > > I can say that I consider the reuse of memory a problem to be solved at > allocation time. You would hope that a good malloc system would give > you back the same memory you just freed when getting rid of the prior > matrix, so you would get the speedup you want using your approach. > What do you mean by "your approach"? Do you mean the first option where I just always destroy the matrix? Are you basically saying that when I destroy the old matrix and create a new one, it should just give me the same block of memory that was just freed by the destruction of the previous one? > > Second, I think the allocation cost is likely to pale in comparison to the > cost of writing the matrix itself (passing all those indices and values > through > the memory bus), and so reuse of the memory is not that important (I > think). > This seems to suggest that the best option is just to destroy and recreate and not worry about "re-preallocating". Do I understand that correctly? > > Thanks, > > Matt > > >> Thanks! >> >> Sam >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jun 11 19:54:47 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 11 Jun 2022 20:54:47 -0400 Subject: [petsc-users] Mat preallocation for adaptive grid In-Reply-To: References: Message-ID: On Sat, Jun 11, 2022 at 8:43 PM Samuel Estes wrote: > I'm sorry, would you mind clarifying? I think my email was so long and > rambling that it's tough for me to understand which part was being > answered. > > On Sat, Jun 11, 2022 at 7:38 PM Matthew Knepley wrote: > >> On Sat, Jun 11, 2022 at 8:32 PM Samuel Estes >> wrote: >> >>> Hello, >>> >>> My question concerns preallocation for Mats in adaptive FEM problems. >>> When the grid refines, I destroy the old matrix and create a new one of the >>> appropriate (larger size). When the grid ?un-refines? I just use the same >>> (extra large) matrix and pad the extra unused diagonal entries with 1?s. >>> The problem comes in with the preallocation. I use the MatPreallocator, >>> MatPreallocatorPreallocate() paradigm which requires a specific sparsity >>> pattern. 
When the grid un-refines, although the total number of nonzeros >>> allocated is (most likely) more than sufficient, the particular sparsity >>> pattern changes which leads to mallocs in the MatSetValues routines and >>> obviously I would like to avoid this. >>> >>> One obvious solution is just to destroy and recreate the matrix any time >>> the grid changes, even if it gets smaller. By just using a new matrix every >>> time, I would avoid this problem although at the cost of having to rebuild >>> the matrix more often than necessary. This is the simplest solution from a >>> programming perspective and probably the one I will go with. >>> >>> I'm just curious if there's an alternative that you would recommend? >>> Basically what I would like to do is to just change the sparsity pattern >>> that is created in the MatPreallocatorPreallocate() routine. I'm not sure >>> how it works under the hood, but in principle, it should be possible to >>> keep the memory allocated for the Mat values and just assign them new >>> column numbers and potentially add new nonzeros as well. Is there a >>> convenient way of doing this? One thought I had was to just fill in the >>> MatPreallocator object with the new sparsity pattern of the coarser mesh >>> and then call the MatPreallocatorPreallocate() routine again with the new >>> MatPreallocator matrix. I'm just not sure how exactly that would work since >>> it would have already been called for the FEM matrix for the previous, >>> finer grid. >>> >>> Finally, does this really matter? I imagine the bottleneck (assuming >>> good preallocation) is in the solver so maybe it doesn't make much >>> difference whether or not I reuse the old matrix. In that case, going with >>> option 1 and simply destroying and recreating the matrix would be the way >>> to go just to save myself some time. >>> >>> I hope that my question is clear. If not, please let me know and I will >>> clarify. I am very curious if there's a convenient solution for the second >>> option I mentioned to recycle the allocated memory and redo the sparsity >>> pattern. >>> >> >> I have not run any tests of this kind of thing, so I cannot say >> definitively. >> >> I can say that I consider the reuse of memory a problem to be solved at >> allocation time. You would hope that a good malloc system would give >> you back the same memory you just freed when getting rid of the prior >> matrix, so you would get the speedup you want using your approach. >> > > What do you mean by "your approach"? Do you mean the first option where I > just always destroy the matrix? Are you basically saying that when I > destroy the old matrix and create a new one, it should just give me the > same block of memory that was just freed by the destruction of the previous > one? > Yes. > >> Second, I think the allocation cost is likely to pale in comparison to >> the cost of writing the matrix itself (passing all those indices and values >> through >> the memory bus), and so reuse of the memory is not that important (I >> think). >> > > This seems to suggest that the best option is just to destroy and recreate > and not worry about "re-preallocating". Do I understand that correctly? > Yes. Thanks, Matt > >> Thanks, >> >> Matt >> >> >>> Thanks! >>> >>> Sam >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuelestes91 at gmail.com Sat Jun 11 20:24:19 2022 From: samuelestes91 at gmail.com (Samuel Estes) Date: Sat, 11 Jun 2022 20:24:19 -0500 Subject: [petsc-users] Mat preallocation for adaptive grid In-Reply-To: References: Message-ID: Ok thanks so much for the help! It's nice that it coincides with the easiest option! On Sat, Jun 11, 2022 at 7:54 PM Matthew Knepley wrote: > On Sat, Jun 11, 2022 at 8:43 PM Samuel Estes > wrote: > >> I'm sorry, would you mind clarifying? I think my email was so long and >> rambling that it's tough for me to understand which part was being >> answered. >> >> On Sat, Jun 11, 2022 at 7:38 PM Matthew Knepley >> wrote: >> >>> On Sat, Jun 11, 2022 at 8:32 PM Samuel Estes >>> wrote: >>> >>>> Hello, >>>> >>>> My question concerns preallocation for Mats in adaptive FEM problems. >>>> When the grid refines, I destroy the old matrix and create a new one of the >>>> appropriate (larger size). When the grid ?un-refines? I just use the same >>>> (extra large) matrix and pad the extra unused diagonal entries with 1?s. >>>> The problem comes in with the preallocation. I use the MatPreallocator, >>>> MatPreallocatorPreallocate() paradigm which requires a specific sparsity >>>> pattern. When the grid un-refines, although the total number of nonzeros >>>> allocated is (most likely) more than sufficient, the particular sparsity >>>> pattern changes which leads to mallocs in the MatSetValues routines and >>>> obviously I would like to avoid this. >>>> >>>> One obvious solution is just to destroy and recreate the matrix any >>>> time the grid changes, even if it gets smaller. By just using a new matrix >>>> every time, I would avoid this problem although at the cost of having to >>>> rebuild the matrix more often than necessary. This is the simplest solution >>>> from a programming perspective and probably the one I will go with. >>>> >>>> I'm just curious if there's an alternative that you would recommend? >>>> Basically what I would like to do is to just change the sparsity pattern >>>> that is created in the MatPreallocatorPreallocate() routine. I'm not sure >>>> how it works under the hood, but in principle, it should be possible to >>>> keep the memory allocated for the Mat values and just assign them new >>>> column numbers and potentially add new nonzeros as well. Is there a >>>> convenient way of doing this? One thought I had was to just fill in the >>>> MatPreallocator object with the new sparsity pattern of the coarser mesh >>>> and then call the MatPreallocatorPreallocate() routine again with the new >>>> MatPreallocator matrix. I'm just not sure how exactly that would work since >>>> it would have already been called for the FEM matrix for the previous, >>>> finer grid. >>>> >>>> Finally, does this really matter? I imagine the bottleneck (assuming >>>> good preallocation) is in the solver so maybe it doesn't make much >>>> difference whether or not I reuse the old matrix. In that case, going with >>>> option 1 and simply destroying and recreating the matrix would be the way >>>> to go just to save myself some time. >>>> >>>> I hope that my question is clear. 
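To make the "destroy and recreate on every mesh change" option that was settled on above concrete, here is a rough sketch of what the rebuild step could look like (the element loop is only indicated by a comment, the local size nLocal is a placeholder, and this is not code from this thread):

    #include <petscmat.h>

    /* Rebuild the operator from scratch after every mesh change (refine or un-refine). */
    static PetscErrorCode RebuildOperator(PetscInt nLocal, Mat *A)
    {
      Mat preall;

      PetscFunctionBeginUser;
      PetscCall(MatDestroy(A)); /* no-op on the first call if *A is NULL */

      /* Record the new sparsity pattern with a MATPREALLOCATOR matrix. */
      PetscCall(MatCreate(PETSC_COMM_WORLD, &preall));
      PetscCall(MatSetSizes(preall, nLocal, nLocal, PETSC_DETERMINE, PETSC_DETERMINE));
      PetscCall(MatSetType(preall, MATPREALLOCATOR));
      PetscCall(MatSetUp(preall));
      /* ... the same element loop and MatSetValues() calls as the real assembly ... */
      PetscCall(MatAssemblyBegin(preall, MAT_FINAL_ASSEMBLY));
      PetscCall(MatAssemblyEnd(preall, MAT_FINAL_ASSEMBLY));

      /* Create the new operator and let the preallocator size its storage. */
      PetscCall(MatCreate(PETSC_COMM_WORLD, A));
      PetscCall(MatSetSizes(*A, nLocal, nLocal, PETSC_DETERMINE, PETSC_DETERMINE));
      PetscCall(MatSetType(*A, MATAIJ));
      PetscCall(MatPreallocatorPreallocate(preall, PETSC_TRUE, *A));
      PetscCall(MatDestroy(&preall));
      PetscFunctionReturn(0);
    }

Because the old matrix is destroyed immediately before the new one is created, a reasonable malloc implementation will usually hand back the same memory, which is the point Matt makes above.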
If not, please let me know and I will >>>> clarify. I am very curious if there's a convenient solution for the second >>>> option I mentioned to recycle the allocated memory and redo the sparsity >>>> pattern. >>>> >>> >>> I have not run any tests of this kind of thing, so I cannot say >>> definitively. >>> >>> I can say that I consider the reuse of memory a problem to be solved at >>> allocation time. You would hope that a good malloc system would give >>> you back the same memory you just freed when getting rid of the prior >>> matrix, so you would get the speedup you want using your approach. >>> >> >> What do you mean by "your approach"? Do you mean the first option where I >> just always destroy the matrix? Are you basically saying that when I >> destroy the old matrix and create a new one, it should just give me the >> same block of memory that was just freed by the destruction of the previous >> one? >> > > Yes. > > >> >>> Second, I think the allocation cost is likely to pale in comparison to >>> the cost of writing the matrix itself (passing all those indices and values >>> through >>> the memory bus), and so reuse of the memory is not that important (I >>> think). >>> >> >> This seems to suggest that the best option is just to destroy and >> recreate and not worry about "re-preallocating". Do I understand that >> correctly? >> > > Yes. > > Thanks, > > Matt > > >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks! >>>> >>>> Sam >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jun 12 03:07:59 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 12 Jun 2022 10:07:59 +0200 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> Message-ID: <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> Please always respond to the list. Pay attention to the warnings in the log: ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option. # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## With the debugging option the times are not trustworthy, so I suggest repeating the analysis with an optimized build. Jose > El 12 jun 2022, a las 5:41, Runfeng Jin escribi?: > > Hello! > I compare these two matrix solver's log view and find some strange thing. Attachment files are the log view.: > file 1: log of matrix A solver. This is a larger matrix(900,000*900,000) but solved quickly(30s); > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547 , a little different from the matrix B that is mentioned in initial email, but solved much slower too. I use this for a quicker test) but solved much slower(1244s). 
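A quick way to see how the nonzeros are actually distributed is to query each assembled matrix with MatGetInfo on every process; a minimal sketch, where A stands for either matrix:

  MatInfo     info;
  PetscMPIInt rank;
  PetscCall(MatGetInfo(A, MAT_LOCAL, &info));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCall(PetscSynchronizedPrintf(PETSC_COMM_WORLD, "[%d] nz_used %g  nz_allocated %g  mallocs %g\n",
                                    rank, info.nz_used, info.nz_allocated, info.mallocs));
  PetscCall(PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT));

If one rank reports a much larger nz_used than the others, the row distribution of the matrix is a likely suspect before anything in the eigensolver itself.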
> > By comparing these two files, I find some thing: > 1) Matrix A has more basis vectors(375) than B(189), but A spent less time on BVCreate(0.349s) than B(296s); > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) > 3) Matrix B distribute much more unbalancedly storage among processors(memory max/min 4365) than A(memory max/min 1.113), but other metrics seems more balanced. > > I don't do prealocation in A, and it is distributed across processors by PETSc. For B , when preallocation I use PetscSplitOwnership to decide which part belongs to local processor, and B is also distributed by PETSc when compute matrix values. > > - Does this mean, for matrix B, too much nonzero elements are stored in single process, and this is why it cost too much more time in solving the matrix and find eigenvalues? If so, are there some better ways to distribute the matrix among processors? > - Or are there any else reasons for this difference in cost time? > > Hope to recieve your reply, thank you! > > Runfeng Jin > > > > Runfeng Jin ?2022?6?11??? 20:33??? > Hello! > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much time. Is there anything else I can do? Attachment is log when use PETSC_DEFAULT for eps_ncv. > > Thank you ! > > Runfeng Jin > > Jose E. Roman ?2022?6?10??? 20:50??? > The value -eps_ncv 5000 is huge. > Better let SLEPc use the default value. > > Jose > > > > El 10 jun 2022, a las 14:24, Jin Runfeng escribi?: > > > > Hello! > > I want to acquire the 3 smallest eigenvalue, and attachment is the log view output. I can see epssolve really cost the major time. But I can not see why it cost so much time. Can you see something from it? > > > > Thank you ! > > > > Runfeng Jin > > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman wrote: > > Convergence depends on distribution of eigenvalues you want to compute. On the other hand, the cost also depends on the time it takes to build the preconditioner. Use -log_view to see the cost of the different steps of the computation. > > > > Jose > > > > > > > El 3 jun 2022, a las 18:50, jsfaraway escribi?: > > > > > > hello! > > > > > > I am trying to use epsgd compute matrix's one smallest eigenvalue. And I find a strang thing. There are two matrix A(900000*900000) and B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B use 22 iterations and 38885s! What could be the reason for this? Or what can I do to find the reason? > > > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". > > > And there is one difference I can tell is matrix B has many small value, whose absolute value is less than 10-6. Could this be the reason? > > > > > > Thank you! > > > > > > Runfeng Jin > > > > From sami.ben-elhaj-salah at ensma.fr Sun Jun 12 09:48:44 2022 From: sami.ben-elhaj-salah at ensma.fr (Sami BEN ELHAJ SALAH) Date: Sun, 12 Jun 2022 16:48:44 +0200 Subject: [petsc-users] Writing VTK output In-Reply-To: References: <87czfje0ol.fsf@jedbrown.org> <875ylbdyfk.fsf@jedbrown.org> <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> Message-ID: Dear Matthew and Jed, Thank you very much for explaining and your help. I am sorry for my late reply. For me, the .vtu file is wrong when the section seems to be not correct (I mean the raw encoding because when I visualize the .vtu file on paraview, the geometry is not good). The header is OK (see attached file). To generate the vtu file, I use the routine suggested by Matthew and the commande line proposed by Jed (-dm_plex_filename 2C3D8_msh.msh -dm_view vtk:2C3D8_msh.vtu). 
On the other hand, when I use the routine below and write my output to a vtk file and not vtu, the result is ok except the rotation of the elements nodes (the nodes rotation is not good for me and not saved comparing to gmsh file). PetscViewer vtk; PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); VecView(solution,vtk); PetscViewerDestroy(&vtk); I put here an example of a vtk file that I have generated # vtk DataFile Version 2.0 Simplicial Mesh Example ASCII DATASET UNSTRUCTURED_GRID POINTS 12 double 0.000000e+00 1.000000e+01 1.000000e+01 0.000000e+00 0.000000e+00 1.000000e+01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+01 0.000000e+00 1.000000e+01 1.000000e+01 1.000000e+01 1.000000e+01 0.000000e+00 1.000000e+01 1.000000e+01 0.000000e+00 0.000000e+00 1.000000e+01 1.000000e+01 0.000000e+00 2.000000e+01 1.000000e+01 1.000000e+01 2.000000e+01 0.000000e+00 1.000000e+01 2.000000e+01 0.000000e+00 0.000000e+00 2.000000e+01 1.000000e+01 0.000000e+00 CELLS 2 18 8 0 3 2 1 4 5 6 7 8 4 7 6 5 8 9 10 11 CELL_TYPES 2 12 12 POINT_DATA 12 VECTORS dU_x double 2.754808e-10 -8.653846e-11 -8.653846e-11 2.754808e-10 8.653846e-11 -8.653846e-11 2.754808e-10 8.653846e-11 8.653846e-11 2.754808e-10 -8.653846e-11 8.653846e-11 4.678571e-01 -9.107143e-02 -9.107143e-02 4.678571e-01 9.107143e-02 -9.107143e-02 4.678571e-01 9.107143e-02 9.107143e-02 4.678571e-01 -9.107143e-02 9.107143e-02 1.000000e+00 -7.500000e-02 -7.500000e-02 1.000000e+00 7.500000e-02 -7.500000e-02 1.000000e+00 7.500000e-02 7.500000e-02 1.000000e+00 -7.500000e-02 7.500000e-02 To obtain the good geometry, the two lines 8 0 3 2 1 4 5 6 7 8 4 7 6 5 8 9 10 11 Should be like this in order to have a good geometry defined in the gmsh file. 8 0 1 2 3 4 5 6 7 8 4 5 6 7 8 9 10 11 - - - > So I m trying now to compile my code with petsc 3.16, may be it solves the problem of the rotation order of nodes. Thank you and have a good day, Sami, -- Dr. Sami BEN ELHAJ SALAH Ing?nieur de Recherche (CNRS) Institut Pprime - ISAE - ENSMA Mobile: 06.62.51.26.74 Email: sami.ben-elhaj-salah at ensma.fr www.samibenelhajsalah.com > Le 8 juin 2022 ? 17:57, Matthew Knepley a ?crit : > > On Wed, Jun 8, 2022 at 11:24 AM Sami BEN ELHAJ SALAH > wrote: > Yes, the file "sami.vtu" is loaded correctly in paraview and I have the good output like you. > > In my code, I tried with the same command given in your last answer and I still have the wrong .vtu file. > > Hi Sami, > > What do you mean by wrong? > > Can you just use the simple procedure: > > PetscCall(DMCreate(comm, dm)); > PetscCall(DMSetType(*dm, DMPLEX)); > PetscCall(DMSetFromOptions(*dm)); > PetscCall(DMViewFromOptions(*dm, NULL, "-dm_view")); > > This is the one that works for us. Then we can change it in your code one step at a time until you get what you need. > > Thanks, > > Matt > > I use this: > mpirun -np 1 /home/benelhasa/fox_petsc/build_test/bin/Debug/FoXtroT -snes_test_jacobian_view -snes_converged_reason -snes_monitor -ksp_monitor -ksp_xmonitor -dm_plex_filename cub_2C3D8_msh.msh -dm_view vtk:cub_2C3D8_msh.vtu cub_8C3D8.fxt > > > Thanks, > Sami, > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com > > > >> Le 8 juin 2022 ? 16:25, Jed Brown > a ?crit : >> >> Does the file load in paraview? When I load your *.msh in a tutorial with -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. 
>> >> >> Sami BEN ELHAJ SALAH > writes: >> >>> Hi Jed, >>> >>> Thank you for your answer. >>> >>> When I use a ??solution.vtu'', I obtain a wrong file. >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4 at 4@$@@ >>> ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??b#???????333????333??_#?????? ?333????333??b#??????(?333??'?333??a#???????333??>?333?? >>> >>> >>> >>> >>> If I understand your answer, to solve my problem, should just upgrade all my software ? >>> >>> Thanks, >>> Sami, >>> >>> >>> -- >>> Dr. Sami BEN ELHAJ SALAH >>> Ing?nieur de Recherche (CNRS) >>> Institut Pprime - ISAE - ENSMA >>> Mobile: 06.62.51.26.74 >>> Email: sami.ben-elhaj-salah at ensma.fr >>> www.samibenelhajsalah.com > >>> >>> >>> >>>> Le 8 juin 2022 ? 15:37, Jed Brown > a ?crit : >>>> >>>> You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? >>>> >>>> Sami BEN ELHAJ SALAH > writes: >>>> >>>>> Dear Petsc Developer team, >>>>> >>>>> I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. >>>>> >>>>> 1) Algorithm 1 >>>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>>> PetscViewer vtk; >>>>> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >>>>> VecView(solution,vtk); >>>>> PetscViewerDestroy(&vtk); >>>>> >>>>> >>>>> 2) Algorithm 2 >>>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>>> PetscViewer vtk; >>>>> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >>>>> PetscViewerSetType(vtk, PETSCVIEWERVTK); >>>>> PetscViewerFileSetName(vtk, "sol.vtk"); >>>>> VecView(solution, vtk); >>>>> PetscViewerDestroy(&vtk); >>>>> >>>>> The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? 
>>>>> >>>>> Other information used: >>>>> - gmsh format 2.2 >>>>> - Vtk version: 7.1.1 >>>>> - Petsc version: 3.13/opt >>>>> >>>>> Below my two files gmsh and vtk: >>>>> >>>>> Gmsh file: >>>>> $MeshFormat >>>>> 2.2 0 8 >>>>> $EndMeshFormat >>>>> $Nodes >>>>> 12 >>>>> 1 0.0 10.0 10.0 >>>>> 2 0.0 0.0 10.0 >>>>> 3 0.0 0.0 0.0 >>>>> 4 0.0 10.0 0.0 >>>>> 5 10.0 10.0 10.0 >>>>> 6 10.0 0.0 10.0 >>>>> 7 10.0 0.0 0.0 >>>>> 8 10.0 10.0 0.0 >>>>> 9 20.0 10.0 10.0 >>>>> 10 20.0 0.0 10.0 >>>>> 11 20.0 0.0 0.0 >>>>> 12 20.0 10.0 0.0 >>>>> $EndNodes >>>>> $Elements >>>>> 2 >>>>> 1 5 2 68 60 1 2 3 4 5 6 7 8 >>>>> 2 5 2 68 60 5 6 7 8 9 10 11 12 >>>>> $EndElements >>>>> >>>>> Vtk file : >>>>> # vtk DataFile Version 2.0 >>>>> Simplicial Mesh Example >>>>> ASCII >>>>> DATASET UNSTRUCTURED_GRID >>>>> POINTS 12 double >>>>> 0.000000e+00 1.000000e+01 1.000000e+01 >>>>> 0.000000e+00 0.000000e+00 1.000000e+01 >>>>> 0.000000e+00 0.000000e+00 0.000000e+00 >>>>> 0.000000e+00 1.000000e+01 0.000000e+00 >>>>> 1.000000e+01 1.000000e+01 1.000000e+01 >>>>> 1.000000e+01 0.000000e+00 1.000000e+01 >>>>> 1.000000e+01 0.000000e+00 0.000000e+00 >>>>> 1.000000e+01 1.000000e+01 0.000000e+00 >>>>> 2.000000e+01 1.000000e+01 1.000000e+01 >>>>> 2.000000e+01 0.000000e+00 1.000000e+01 >>>>> 2.000000e+01 0.000000e+00 0.000000e+00 >>>>> 2.000000e+01 1.000000e+01 0.000000e+00 >>>>> CELLS 2 18 >>>>> 8 0 3 2 1 4 5 6 7 >>>>> 8 4 7 6 5 8 9 10 11 >>>>> CELL_TYPES 2 >>>>> 12 >>>>> 12 >>>>> POINT_DATA 12 >>>>> VECTORS dU_x double >>>>> 2.754808e-10 -8.653846e-11 -8.653846e-11 >>>>> 2.754808e-10 8.653846e-11 -8.653846e-11 >>>>> 2.754808e-10 8.653846e-11 8.653846e-11 >>>>> 2.754808e-10 -8.653846e-11 8.653846e-11 >>>>> 4.678571e-01 -9.107143e-02 -9.107143e-02 >>>>> 4.678571e-01 9.107143e-02 -9.107143e-02 >>>>> 4.678571e-01 9.107143e-02 9.107143e-02 >>>>> 4.678571e-01 -9.107143e-02 9.107143e-02 >>>>> 1.000000e+00 -7.500000e-02 -7.500000e-02 >>>>> 1.000000e+00 7.500000e-02 -7.500000e-02 >>>>> 1.000000e+00 7.500000e-02 7.500000e-02 >>>>> 1.000000e+00 -7.500000e-02 7.500000e-02 >>>>> >>>>> Thank you in advance and have a good day ! >>>>> >>>>> Sami, >>>>> >>>>> -- >>>>> Dr. Sami BEN ELHAJ SALAH >>>>> Ing?nieur de Recherche (CNRS) >>>>> Institut Pprime - ISAE - ENSMA >>>>> Mobile: 06.62.51.26.74 >>>>> Email: sami.ben-elhaj-salah at ensma.fr >>>>> www.samibenelhajsalah.com > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 2C3D8.vtu Type: application/octet-stream Size: 1319 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jun 13 08:18:56 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 13 Jun 2022 09:18:56 -0400 Subject: [petsc-users] Writing VTK output In-Reply-To: References: <87czfje0ol.fsf@jedbrown.org> <875ylbdyfk.fsf@jedbrown.org> <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> Message-ID: Can you just send your GMsh file so I can see what you are asking for? Also, Plex stores hexes with outward normals, but some other programs store them with some inward normals. This should be converted in the output. I can check this if you send your mesh. 
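In the meantime, a minimal standalone driver that exercises the same path (read the mesh named on the command line, then view it in whatever format the option asks for) is just the following sketch, assuming a recent PETSc where PetscCall() is available:

  #include <petscdmplex.h>

  int main(int argc, char **argv)
  {
    DM dm;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    PetscCall(DMCreate(PETSC_COMM_WORLD, &dm));
    PetscCall(DMSetType(dm, DMPLEX));
    PetscCall(DMSetFromOptions(dm));                    /* -dm_plex_filename cub_2C3D8_msh.msh */
    PetscCall(DMViewFromOptions(dm, NULL, "-dm_view")); /* -dm_view vtk:cub_2C3D8_msh.vtu      */
    PetscCall(DMDestroy(&dm));
    PetscCall(PetscFinalize());
    return 0;
  }

Running it with the two options shown in the comments writes the .vtu directly, so it isolates the mesh reading/writing from the rest of your code.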
Thanks, Matt On Sun, Jun 12, 2022 at 10:48 AM Sami BEN ELHAJ SALAH < sami.ben-elhaj-salah at ensma.fr> wrote: > Dear Matthew and Jed, > > Thank you very much for explaining and your help. I am sorry for my late > reply. > For me, the .vtu file is wrong when the section seems to be > not correct (I mean the raw encoding because when I visualize the .vtu file > on paraview, the geometry is not good). The header is OK (see attached > file). To generate the vtu file, I use the routine suggested by Matthew and > the commande line proposed by Jed (-dm_plex_filename 2C3D8_msh.msh -dm_view > vtk:2C3D8_msh.vtu). > > On the other hand, when I use the routine below and write my output to a > vtk file and not vtu, the result is ok except the rotation of the elements > nodes (the nodes rotation is not good for me and not saved comparing to > gmsh file). > > PetscViewer vtk; > PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); > VecView(solution,vtk); > PetscViewerDestroy(&vtk); > > I put here an example of a vtk file that I have generated > > # vtk DataFile Version 2.0 > Simplicial Mesh Example > ASCII > DATASET UNSTRUCTURED_GRID > POINTS 12 double > 0.000000e+00 1.000000e+01 1.000000e+01 > 0.000000e+00 0.000000e+00 1.000000e+01 > 0.000000e+00 0.000000e+00 0.000000e+00 > 0.000000e+00 1.000000e+01 0.000000e+00 > 1.000000e+01 1.000000e+01 1.000000e+01 > 1.000000e+01 0.000000e+00 1.000000e+01 > 1.000000e+01 0.000000e+00 0.000000e+00 > 1.000000e+01 1.000000e+01 0.000000e+00 > 2.000000e+01 1.000000e+01 1.000000e+01 > 2.000000e+01 0.000000e+00 1.000000e+01 > 2.000000e+01 0.000000e+00 0.000000e+00 > 2.000000e+01 1.000000e+01 0.000000e+00 > CELLS 2 18 > 8 0 3 2 1 4 5 6 7 > 8 4 7 6 5 8 9 10 11 > CELL_TYPES 2 > 12 > 12 > POINT_DATA 12 > VECTORS dU_x double > 2.754808e-10 -8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 8.653846e-11 > 2.754808e-10 -8.653846e-11 8.653846e-11 > 4.678571e-01 -9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 9.107143e-02 > 4.678571e-01 -9.107143e-02 9.107143e-02 > 1.000000e+00 -7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 7.500000e-02 > 1.000000e+00 -7.500000e-02 7.500000e-02 > > > To obtain the good geometry, the two lines > > 8 0 3 2 1 4 5 6 7 > 8 4 7 6 5 8 9 10 11 > > Should be like this in order to have a good geometry defined in the gmsh > file. > > 8 0 1 2 3 4 5 6 7 > 8 4 5 6 7 8 9 10 11 > > > - - - > So I m trying now to compile my code with petsc 3.16, may be it > solves the problem of the rotation order of nodes. > > Thank you and have a good day, > > Sami, > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com > > > > > Le 8 juin 2022 ? 17:57, Matthew Knepley a ?crit : > > On Wed, Jun 8, 2022 at 11:24 AM Sami BEN ELHAJ SALAH < > sami.ben-elhaj-salah at ensma.fr> wrote: > >> Yes, the file "sami.vtu" is loaded correctly in paraview and I have the >> good output like you. >> >> In my code, I tried with the same command given in your last answer and I >> still have the wrong .vtu file. >> > > Hi Sami, > > What do you mean by wrong? 
> > Can you just use the simple procedure: > > PetscCall(DMCreate(comm, dm)); > PetscCall(DMSetType(*dm, DMPLEX)); > PetscCall(DMSetFromOptions(*dm)); > PetscCall(DMViewFromOptions(*dm, NULL, "-dm_view")); > > This is the one that works for us. Then we can change it in your code one > step at a time until you get what you need. > > Thanks, > > Matt > > >> I use this: >> mpirun -np 1 /home/benelhasa/fox_petsc/build_test/bin/Debug/FoXtroT >> -snes_test_jacobian_view -snes_converged_reason -snes_monitor -ksp_monitor >> -ksp_xmonitor -dm_plex_filename cub_2C3D8_msh.msh -dm_view >> vtk:cub_2C3D8_msh.vtu cub_8C3D8.fxt >> >> >> Thanks, >> Sami, >> >> -- >> Dr. Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com >> >> >> >> >> Le 8 juin 2022 ? 16:25, Jed Brown a ?crit : >> >> Does the file load in paraview? When I load your *.msh in a tutorial with >> -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. >> >> >> Sami BEN ELHAJ SALAH writes: >> >> Hi Jed, >> >> Thank you for your answer. >> >> When I use a ??solution.vtu'', I obtain a wrong file. >> >> >> >> >> >> >> > format="appended" offset="0" /> >> >> >> > format="appended" offset="292" /> >> > format="appended" offset="360" /> >> > format="appended" offset="372" /> >> >> >> > format="appended" offset="378" /> >> >> >> > format="appended" offset="390" /> >> >> >> >> >> _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4 at 4@$@@ >> ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o >> _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? >> uP??b#???????333????333??_#?????? >> ?333????333??b#??????(?333??'?333??a#???????333??>?333?? >> >> >> >> >> If I understand your answer, to solve my problem, should just upgrade all >> my software ? >> >> Thanks, >> Sami, >> >> >> -- >> Dr. Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com < >> https://samiben91.github.io/samibenelhajsalah/index.html> >> >> >> >> Le 8 juin 2022 ? 15:37, Jed Brown a ?crit : >> >> You're using pretty old versions of all software; I'd recommend >> upgrading. I recommend choosing the file name "solution.vtu" to use the >> modern (non-legacy) format. Does that work for you? >> >> Sami BEN ELHAJ SALAH writes: >> >> Dear Petsc Developer team, >> >> I solved a linear elastic problem in 3D using a DMPLEX. My system is >> converging, then I would like to write out my solution vector to a vtk file >> where I use unstructured mesh. Currently, I tried two algorithms and I have >> the same result. >> >> 1) Algorithm 1 >> err = SNESSolve(_snes, bc_vec_test, solution); >> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >> PetscViewer vtk; >> >> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >> >> VecView(solution,vtk); >> PetscViewerDestroy(&vtk); >> >> >> 2) Algorithm 2 >> err = SNESSolve(_snes, bc_vec_test, solution); >> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >> PetscViewer vtk; >> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >> PetscViewerSetType(vtk, PETSCVIEWERVTK); >> PetscViewerFileSetName(vtk, "sol.vtk"); >> VecView(solution, vtk); >> PetscViewerDestroy(&vtk); >> >> The result seems correct except for the rotation order of the nodes (see >> the red lines on gmsh and vtk file respectively). 
Then, I visualized my vtk >> file with paraview, and I remarked that my geometry is not correct and not >> conserved when comparing it with my gmsh file. So, I didn?t understand why >> the rotation order of nodes is not conserved when saving my result to a vtk >> file? >> >> Other information used: >> - gmsh format 2.2 >> - Vtk version: 7.1.1 >> - Petsc version: 3.13/opt >> >> Below my two files gmsh and vtk: >> >> Gmsh file: >> $MeshFormat >> 2.2 0 8 >> $EndMeshFormat >> $Nodes >> 12 >> 1 0.0 10.0 10.0 >> 2 0.0 0.0 10.0 >> 3 0.0 0.0 0.0 >> 4 0.0 10.0 0.0 >> 5 10.0 10.0 10.0 >> 6 10.0 0.0 10.0 >> 7 10.0 0.0 0.0 >> 8 10.0 10.0 0.0 >> 9 20.0 10.0 10.0 >> 10 20.0 0.0 10.0 >> 11 20.0 0.0 0.0 >> 12 20.0 10.0 0.0 >> $EndNodes >> $Elements >> 2 >> 1 5 2 68 60 1 2 3 4 5 6 7 8 >> 2 5 2 68 60 5 6 7 8 9 10 11 12 >> $EndElements >> >> Vtk file : >> # vtk DataFile Version 2.0 >> Simplicial Mesh Example >> ASCII >> DATASET UNSTRUCTURED_GRID >> POINTS 12 double >> 0.000000e+00 1.000000e+01 1.000000e+01 >> 0.000000e+00 0.000000e+00 1.000000e+01 >> 0.000000e+00 0.000000e+00 0.000000e+00 >> 0.000000e+00 1.000000e+01 0.000000e+00 >> 1.000000e+01 1.000000e+01 1.000000e+01 >> 1.000000e+01 0.000000e+00 1.000000e+01 >> 1.000000e+01 0.000000e+00 0.000000e+00 >> 1.000000e+01 1.000000e+01 0.000000e+00 >> 2.000000e+01 1.000000e+01 1.000000e+01 >> 2.000000e+01 0.000000e+00 1.000000e+01 >> 2.000000e+01 0.000000e+00 0.000000e+00 >> 2.000000e+01 1.000000e+01 0.000000e+00 >> CELLS 2 18 >> 8 0 3 2 1 4 5 6 7 >> 8 4 7 6 5 8 9 10 11 >> CELL_TYPES 2 >> 12 >> 12 >> POINT_DATA 12 >> VECTORS dU_x double >> 2.754808e-10 -8.653846e-11 -8.653846e-11 >> 2.754808e-10 8.653846e-11 -8.653846e-11 >> 2.754808e-10 8.653846e-11 8.653846e-11 >> 2.754808e-10 -8.653846e-11 8.653846e-11 >> 4.678571e-01 -9.107143e-02 -9.107143e-02 >> 4.678571e-01 9.107143e-02 -9.107143e-02 >> 4.678571e-01 9.107143e-02 9.107143e-02 >> 4.678571e-01 -9.107143e-02 9.107143e-02 >> 1.000000e+00 -7.500000e-02 -7.500000e-02 >> 1.000000e+00 7.500000e-02 -7.500000e-02 >> 1.000000e+00 7.500000e-02 7.500000e-02 >> 1.000000e+00 -7.500000e-02 7.500000e-02 >> >> Thank you in advance and have a good day ! >> >> Sami, >> >> -- >> Dr. Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com < >> https://samiben91.github.io/samibenelhajsalah/index.html> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Mon Jun 13 09:50:32 2022 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Mon, 13 Jun 2022 07:50:32 -0700 Subject: [petsc-users] Calling Pytorch and Python within PETSc C/C++ code Message-ID: Hi Guys, Is there a PETSc interface to make calls to Python scripts or libraries (e.g., Pytorch) from a C/C++ code making use of PETSc? If so, are there some examples that I refer to? Thanks, -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tangqi at msu.edu Mon Jun 13 11:20:17 2022 From: tangqi at msu.edu (Tang, Qi) Date: Mon, 13 Jun 2022 16:20:17 +0000 Subject: [petsc-users] [EXTERNAL] Re: Question about SuperLU In-Reply-To: References: Message-ID: <4933ECB8-4977-4E1D-BD2A-87671C542A6F@msu.edu> Sherry, -mat_superlu_dist_replacetinypivot This flag makes superlu_dist back to working for the full VB block as we would think. Thanks again for the suggestion. Does this imply anything for the VB block matrix? Are we just unlucky or does that imply we have tiny terms along the diagonal and the matrix is not very good? (It could be the case since it is a stablized saddle point problem.) Again, we estimate the condition number through petsc and it is reasonable. Qi On Jun 10, 2022, at 6:35 PM, Xiaoye S. Li wrote: -mat_superlu_dist_replacetinypivot -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Mon Jun 13 12:01:20 2022 From: xsli at lbl.gov (Xiaoye S. Li) Date: Mon, 13 Jun 2022 19:01:20 +0200 Subject: [petsc-users] [EXTERNAL] Re: Question about SuperLU In-Reply-To: <4933ECB8-4977-4E1D-BD2A-87671C542A6F@msu.edu> References: <4933ECB8-4977-4E1D-BD2A-87671C542A6F@msu.edu> Message-ID: Can you write down (in matrix notation) what does full VB matrix look like? Sherry On Mon, Jun 13, 2022 at 6:20 PM Tang, Qi wrote: > Sherry, > > -mat_superlu_dist_replacetinypivot > This flag makes superlu_dist back to working for the full VB block as we > would think. Thanks again for the suggestion. > > Does this imply anything for the VB block matrix? Are we just unlucky or > does that imply we have tiny terms along the diagonal and the matrix is not > very good? (It could be the case since it is a stablized saddle point > problem.) Again, we estimate the condition number through petsc and it is > reasonable. > > Qi > > > On Jun 10, 2022, at 6:35 PM, Xiaoye S. Li wrote: > > -mat_superlu_dist_replacetinypivot > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sami.ben-elhaj-salah at ensma.fr Mon Jun 13 12:47:56 2022 From: sami.ben-elhaj-salah at ensma.fr (Sami BEN ELHAJ SALAH) Date: Mon, 13 Jun 2022 19:47:56 +0200 Subject: [petsc-users] Writing VTK output In-Reply-To: References: <87czfje0ol.fsf@jedbrown.org> <875ylbdyfk.fsf@jedbrown.org> <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> Message-ID: <76639150-0C2A-4307-AE0A-A2A68E5C2A80@ensma.fr> Hi Matthew, Please find attached the gmsh file, Thank you in advance ! Sami -- Dr. Sami BEN ELHAJ SALAH Ing?nieur de Recherche (CNRS) Institut Pprime - ISAE - ENSMA Mobile: 06.62.51.26.74 Email: sami.ben-elhaj-salah at ensma.fr www.samibenelhajsalah.com > Le 13 juin 2022 ? 15:18, Matthew Knepley a ?crit : > > Can you just send your GMsh file so I can see what you are asking for? > > Also, Plex stores hexes with outward normals, but some other programs store them with some inward normals. This > should be converted in the output. I can check this if you send your mesh. > > Thanks, > > Matt > > On Sun, Jun 12, 2022 at 10:48 AM Sami BEN ELHAJ SALAH > wrote: > Dear Matthew and Jed, > Thank you very much for explaining and your help. I am sorry for my late reply. > > For me, the .vtu file is wrong when the section seems to be not correct (I mean the raw encoding because when I visualize the .vtu file on paraview, the geometry is not good). The header is OK (see attached file). 
To generate the vtu file, I use the routine suggested by Matthew and the commande line proposed by Jed (-dm_plex_filename 2C3D8_msh.msh -dm_view vtk:2C3D8_msh.vtu). > > On the other hand, when I use the routine below and write my output to a vtk file and not vtu, the result is ok except the rotation of the elements nodes (the nodes rotation is not good for me and not saved comparing to gmsh file). > > PetscViewer vtk; > PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); > VecView(solution,vtk); > PetscViewerDestroy(&vtk); > I put here an example of a vtk file that I have generated > # vtk DataFile Version 2.0 > Simplicial Mesh Example > ASCII > DATASET UNSTRUCTURED_GRID > POINTS 12 double > 0.000000e+00 1.000000e+01 1.000000e+01 > 0.000000e+00 0.000000e+00 1.000000e+01 > 0.000000e+00 0.000000e+00 0.000000e+00 > 0.000000e+00 1.000000e+01 0.000000e+00 > 1.000000e+01 1.000000e+01 1.000000e+01 > 1.000000e+01 0.000000e+00 1.000000e+01 > 1.000000e+01 0.000000e+00 0.000000e+00 > 1.000000e+01 1.000000e+01 0.000000e+00 > 2.000000e+01 1.000000e+01 1.000000e+01 > 2.000000e+01 0.000000e+00 1.000000e+01 > 2.000000e+01 0.000000e+00 0.000000e+00 > 2.000000e+01 1.000000e+01 0.000000e+00 > CELLS 2 18 > 8 0 3 2 1 4 5 6 7 > 8 4 7 6 5 8 9 10 11 > CELL_TYPES 2 > 12 > 12 > POINT_DATA 12 > VECTORS dU_x double > 2.754808e-10 -8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 -8.653846e-11 > 2.754808e-10 8.653846e-11 8.653846e-11 > 2.754808e-10 -8.653846e-11 8.653846e-11 > 4.678571e-01 -9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 -9.107143e-02 > 4.678571e-01 9.107143e-02 9.107143e-02 > 4.678571e-01 -9.107143e-02 9.107143e-02 > 1.000000e+00 -7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 -7.500000e-02 > 1.000000e+00 7.500000e-02 7.500000e-02 > 1.000000e+00 -7.500000e-02 7.500000e-02 > > To obtain the good geometry, the two lines > 8 0 3 2 1 4 5 6 7 > 8 4 7 6 5 8 9 10 11 > Should be like this in order to have a good geometry defined in the gmsh file. > 8 0 1 2 3 4 5 6 7 > 8 4 5 6 7 8 9 10 11 > > - - - > So I m trying now to compile my code with petsc 3.16, may be it solves the problem of the rotation order of nodes. > > Thank you and have a good day, > > Sami, > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com > > > >> Le 8 juin 2022 ? 17:57, Matthew Knepley > a ?crit : >> >> On Wed, Jun 8, 2022 at 11:24 AM Sami BEN ELHAJ SALAH > wrote: >> Yes, the file "sami.vtu" is loaded correctly in paraview and I have the good output like you. >> >> In my code, I tried with the same command given in your last answer and I still have the wrong .vtu file. >> >> Hi Sami, >> >> What do you mean by wrong? >> >> Can you just use the simple procedure: >> >> PetscCall(DMCreate(comm, dm)); >> PetscCall(DMSetType(*dm, DMPLEX)); >> PetscCall(DMSetFromOptions(*dm)); >> PetscCall(DMViewFromOptions(*dm, NULL, "-dm_view")); >> >> This is the one that works for us. Then we can change it in your code one step at a time until you get what you need. >> >> Thanks, >> >> Matt >> >> I use this: >> mpirun -np 1 /home/benelhasa/fox_petsc/build_test/bin/Debug/FoXtroT -snes_test_jacobian_view -snes_converged_reason -snes_monitor -ksp_monitor -ksp_xmonitor -dm_plex_filename cub_2C3D8_msh.msh -dm_view vtk:cub_2C3D8_msh.vtu cub_8C3D8.fxt >> >> >> Thanks, >> Sami, >> >> -- >> Dr. 
Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com >> >> >> >>> Le 8 juin 2022 ? 16:25, Jed Brown > a ?crit : >>> >>> Does the file load in paraview? When I load your *.msh in a tutorial with -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. >>> >>> >>> Sami BEN ELHAJ SALAH > writes: >>> >>>> Hi Jed, >>>> >>>> Thank you for your answer. >>>> >>>> When I use a ??solution.vtu'', I obtain a wrong file. >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4 at 4@$@@ >>>> ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??b#???????333????333??_#?????? ?333????333??b#??????(?333??'?333??a#???????333??>?333?? >>>> >>>> >>>> >>>> >>>> If I understand your answer, to solve my problem, should just upgrade all my software ? >>>> >>>> Thanks, >>>> Sami, >>>> >>>> >>>> -- >>>> Dr. Sami BEN ELHAJ SALAH >>>> Ing?nieur de Recherche (CNRS) >>>> Institut Pprime - ISAE - ENSMA >>>> Mobile: 06.62.51.26.74 >>>> Email: sami.ben-elhaj-salah at ensma.fr >>>> www.samibenelhajsalah.com > >>>> >>>> >>>> >>>>> Le 8 juin 2022 ? 15:37, Jed Brown > a ?crit : >>>>> >>>>> You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? >>>>> >>>>> Sami BEN ELHAJ SALAH > writes: >>>>> >>>>>> Dear Petsc Developer team, >>>>>> >>>>>> I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. >>>>>> >>>>>> 1) Algorithm 1 >>>>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>>>> PetscViewer vtk; >>>>>> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >>>>>> VecView(solution,vtk); >>>>>> PetscViewerDestroy(&vtk); >>>>>> >>>>>> >>>>>> 2) Algorithm 2 >>>>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>>>> PetscViewer vtk; >>>>>> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >>>>>> PetscViewerSetType(vtk, PETSCVIEWERVTK); >>>>>> PetscViewerFileSetName(vtk, "sol.vtk"); >>>>>> VecView(solution, vtk); >>>>>> PetscViewerDestroy(&vtk); >>>>>> >>>>>> The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? 
>>>>>> >>>>>> Other information used: >>>>>> - gmsh format 2.2 >>>>>> - Vtk version: 7.1.1 >>>>>> - Petsc version: 3.13/opt >>>>>> >>>>>> Below my two files gmsh and vtk: >>>>>> >>>>>> Gmsh file: >>>>>> $MeshFormat >>>>>> 2.2 0 8 >>>>>> $EndMeshFormat >>>>>> $Nodes >>>>>> 12 >>>>>> 1 0.0 10.0 10.0 >>>>>> 2 0.0 0.0 10.0 >>>>>> 3 0.0 0.0 0.0 >>>>>> 4 0.0 10.0 0.0 >>>>>> 5 10.0 10.0 10.0 >>>>>> 6 10.0 0.0 10.0 >>>>>> 7 10.0 0.0 0.0 >>>>>> 8 10.0 10.0 0.0 >>>>>> 9 20.0 10.0 10.0 >>>>>> 10 20.0 0.0 10.0 >>>>>> 11 20.0 0.0 0.0 >>>>>> 12 20.0 10.0 0.0 >>>>>> $EndNodes >>>>>> $Elements >>>>>> 2 >>>>>> 1 5 2 68 60 1 2 3 4 5 6 7 8 >>>>>> 2 5 2 68 60 5 6 7 8 9 10 11 12 >>>>>> $EndElements >>>>>> >>>>>> Vtk file : >>>>>> # vtk DataFile Version 2.0 >>>>>> Simplicial Mesh Example >>>>>> ASCII >>>>>> DATASET UNSTRUCTURED_GRID >>>>>> POINTS 12 double >>>>>> 0.000000e+00 1.000000e+01 1.000000e+01 >>>>>> 0.000000e+00 0.000000e+00 1.000000e+01 >>>>>> 0.000000e+00 0.000000e+00 0.000000e+00 >>>>>> 0.000000e+00 1.000000e+01 0.000000e+00 >>>>>> 1.000000e+01 1.000000e+01 1.000000e+01 >>>>>> 1.000000e+01 0.000000e+00 1.000000e+01 >>>>>> 1.000000e+01 0.000000e+00 0.000000e+00 >>>>>> 1.000000e+01 1.000000e+01 0.000000e+00 >>>>>> 2.000000e+01 1.000000e+01 1.000000e+01 >>>>>> 2.000000e+01 0.000000e+00 1.000000e+01 >>>>>> 2.000000e+01 0.000000e+00 0.000000e+00 >>>>>> 2.000000e+01 1.000000e+01 0.000000e+00 >>>>>> CELLS 2 18 >>>>>> 8 0 3 2 1 4 5 6 7 >>>>>> 8 4 7 6 5 8 9 10 11 >>>>>> CELL_TYPES 2 >>>>>> 12 >>>>>> 12 >>>>>> POINT_DATA 12 >>>>>> VECTORS dU_x double >>>>>> 2.754808e-10 -8.653846e-11 -8.653846e-11 >>>>>> 2.754808e-10 8.653846e-11 -8.653846e-11 >>>>>> 2.754808e-10 8.653846e-11 8.653846e-11 >>>>>> 2.754808e-10 -8.653846e-11 8.653846e-11 >>>>>> 4.678571e-01 -9.107143e-02 -9.107143e-02 >>>>>> 4.678571e-01 9.107143e-02 -9.107143e-02 >>>>>> 4.678571e-01 9.107143e-02 9.107143e-02 >>>>>> 4.678571e-01 -9.107143e-02 9.107143e-02 >>>>>> 1.000000e+00 -7.500000e-02 -7.500000e-02 >>>>>> 1.000000e+00 7.500000e-02 -7.500000e-02 >>>>>> 1.000000e+00 7.500000e-02 7.500000e-02 >>>>>> 1.000000e+00 -7.500000e-02 7.500000e-02 >>>>>> >>>>>> Thank you in advance and have a good day ! >>>>>> >>>>>> Sami, >>>>>> >>>>>> -- >>>>>> Dr. Sami BEN ELHAJ SALAH >>>>>> Ing?nieur de Recherche (CNRS) >>>>>> Institut Pprime - ISAE - ENSMA >>>>>> Mobile: 06.62.51.26.74 >>>>>> Email: sami.ben-elhaj-salah at ensma.fr >>>>>> www.samibenelhajsalah.com > >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cub_2C3D8_msh.msh Type: application/octet-stream Size: 343 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Mon Jun 13 12:58:23 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 13 Jun 2022 11:58:23 -0600 Subject: [petsc-users] Writing VTK output In-Reply-To: <76639150-0C2A-4307-AE0A-A2A68E5C2A80@ensma.fr> References: <87czfje0ol.fsf@jedbrown.org> <875ylbdyfk.fsf@jedbrown.org> <7A2FB4C2-CA56-4D7D-9162-4574E14032C6@ensma.fr> <76639150-0C2A-4307-AE0A-A2A68E5C2A80@ensma.fr> Message-ID: <87a6agfnsw.fsf@jedbrown.org> This file is corrupted. It ends with $Elements 2 1 5 2 68 60 1 2 3 4 5 6 7 8 2 5 2 68 60 5 6 7 8 9 10 11 12 $EndElements//+ Show "*"; That should be $Elements 2 1 5 2 68 60 1 2 3 4 5 6 7 8 2 5 2 68 60 5 6 7 8 9 10 11 12 $EndElements If you fix it, then you can run $ make $PETSC_ARCH/tests/dm/impls/plex/tutorials/ex7 $ $PETSC_ARCH/tests/dm/impls/plex/tutorials/ex7 -dm_plex_filename ~/dl/cub_2C3D8_msh.msh -dm_view vtk:foo.vtu and open foo.vtu in Paraview. It looks correct. Sami BEN ELHAJ SALAH writes: > Hi Matthew, > Please find attached the gmsh file, > Thank you in advance ! > Sami > > -- > Dr. Sami BEN ELHAJ SALAH > Ing?nieur de Recherche (CNRS) > Institut Pprime - ISAE - ENSMA > Mobile: 06.62.51.26.74 > Email: sami.ben-elhaj-salah at ensma.fr > www.samibenelhajsalah.com > > > >> Le 13 juin 2022 ? 15:18, Matthew Knepley a ?crit : >> >> Can you just send your GMsh file so I can see what you are asking for? >> >> Also, Plex stores hexes with outward normals, but some other programs store them with some inward normals. This >> should be converted in the output. I can check this if you send your mesh. >> >> Thanks, >> >> Matt >> >> On Sun, Jun 12, 2022 at 10:48 AM Sami BEN ELHAJ SALAH > wrote: >> Dear Matthew and Jed, >> Thank you very much for explaining and your help. I am sorry for my late reply. >> >> For me, the .vtu file is wrong when the section seems to be not correct (I mean the raw encoding because when I visualize the .vtu file on paraview, the geometry is not good). The header is OK (see attached file). To generate the vtu file, I use the routine suggested by Matthew and the commande line proposed by Jed (-dm_plex_filename 2C3D8_msh.msh -dm_view vtk:2C3D8_msh.vtu). >> >> On the other hand, when I use the routine below and write my output to a vtk file and not vtu, the result is ok except the rotation of the elements nodes (the nodes rotation is not good for me and not saved comparing to gmsh file). 
>> >> PetscViewer vtk; >> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >> VecView(solution,vtk); >> PetscViewerDestroy(&vtk); >> I put here an example of a vtk file that I have generated >> # vtk DataFile Version 2.0 >> Simplicial Mesh Example >> ASCII >> DATASET UNSTRUCTURED_GRID >> POINTS 12 double >> 0.000000e+00 1.000000e+01 1.000000e+01 >> 0.000000e+00 0.000000e+00 1.000000e+01 >> 0.000000e+00 0.000000e+00 0.000000e+00 >> 0.000000e+00 1.000000e+01 0.000000e+00 >> 1.000000e+01 1.000000e+01 1.000000e+01 >> 1.000000e+01 0.000000e+00 1.000000e+01 >> 1.000000e+01 0.000000e+00 0.000000e+00 >> 1.000000e+01 1.000000e+01 0.000000e+00 >> 2.000000e+01 1.000000e+01 1.000000e+01 >> 2.000000e+01 0.000000e+00 1.000000e+01 >> 2.000000e+01 0.000000e+00 0.000000e+00 >> 2.000000e+01 1.000000e+01 0.000000e+00 >> CELLS 2 18 >> 8 0 3 2 1 4 5 6 7 >> 8 4 7 6 5 8 9 10 11 >> CELL_TYPES 2 >> 12 >> 12 >> POINT_DATA 12 >> VECTORS dU_x double >> 2.754808e-10 -8.653846e-11 -8.653846e-11 >> 2.754808e-10 8.653846e-11 -8.653846e-11 >> 2.754808e-10 8.653846e-11 8.653846e-11 >> 2.754808e-10 -8.653846e-11 8.653846e-11 >> 4.678571e-01 -9.107143e-02 -9.107143e-02 >> 4.678571e-01 9.107143e-02 -9.107143e-02 >> 4.678571e-01 9.107143e-02 9.107143e-02 >> 4.678571e-01 -9.107143e-02 9.107143e-02 >> 1.000000e+00 -7.500000e-02 -7.500000e-02 >> 1.000000e+00 7.500000e-02 -7.500000e-02 >> 1.000000e+00 7.500000e-02 7.500000e-02 >> 1.000000e+00 -7.500000e-02 7.500000e-02 >> >> To obtain the good geometry, the two lines >> 8 0 3 2 1 4 5 6 7 >> 8 4 7 6 5 8 9 10 11 >> Should be like this in order to have a good geometry defined in the gmsh file. >> 8 0 1 2 3 4 5 6 7 >> 8 4 5 6 7 8 9 10 11 >> >> - - - > So I m trying now to compile my code with petsc 3.16, may be it solves the problem of the rotation order of nodes. >> >> Thank you and have a good day, >> >> Sami, >> >> -- >> Dr. Sami BEN ELHAJ SALAH >> Ing?nieur de Recherche (CNRS) >> Institut Pprime - ISAE - ENSMA >> Mobile: 06.62.51.26.74 >> Email: sami.ben-elhaj-salah at ensma.fr >> www.samibenelhajsalah.com >> >> >> >>> Le 8 juin 2022 ? 17:57, Matthew Knepley > a ?crit : >>> >>> On Wed, Jun 8, 2022 at 11:24 AM Sami BEN ELHAJ SALAH > wrote: >>> Yes, the file "sami.vtu" is loaded correctly in paraview and I have the good output like you. >>> >>> In my code, I tried with the same command given in your last answer and I still have the wrong .vtu file. >>> >>> Hi Sami, >>> >>> What do you mean by wrong? >>> >>> Can you just use the simple procedure: >>> >>> PetscCall(DMCreate(comm, dm)); >>> PetscCall(DMSetType(*dm, DMPLEX)); >>> PetscCall(DMSetFromOptions(*dm)); >>> PetscCall(DMViewFromOptions(*dm, NULL, "-dm_view")); >>> >>> This is the one that works for us. Then we can change it in your code one step at a time until you get what you need. >>> >>> Thanks, >>> >>> Matt >>> >>> I use this: >>> mpirun -np 1 /home/benelhasa/fox_petsc/build_test/bin/Debug/FoXtroT -snes_test_jacobian_view -snes_converged_reason -snes_monitor -ksp_monitor -ksp_xmonitor -dm_plex_filename cub_2C3D8_msh.msh -dm_view vtk:cub_2C3D8_msh.vtu cub_8C3D8.fxt >>> >>> >>> Thanks, >>> Sami, >>> >>> -- >>> Dr. Sami BEN ELHAJ SALAH >>> Ing?nieur de Recherche (CNRS) >>> Institut Pprime - ISAE - ENSMA >>> Mobile: 06.62.51.26.74 >>> Email: sami.ben-elhaj-salah at ensma.fr >>> www.samibenelhajsalah.com >>> >>> >>> >>>> Le 8 juin 2022 ? 16:25, Jed Brown > a ?crit : >>>> >>>> Does the file load in paraview? 
When I load your *.msh in a tutorial with -dm_plex_filename sami.msh -dm_view vtk:sami.vtu, I get this good output. >>>> >>>> >>>> Sami BEN ELHAJ SALAH > writes: >>>> >>>>> Hi Jed, >>>>> >>>>> Thank you for your answer. >>>>> >>>>> When I use a ??solution.vtu'', I obtain a wrong file. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> _ $@$@$@$@$@$@$@$@$@$@$@$@4@$@$@4@$@4 at 4@$@@ >>>>> ?p?O??=??sT?????sT????p?O??=??sT???=??sT????p?O??=??sT???=??sT???=?p?O??=??sT?????sT???=o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??o _????? uP???? uP??b#???????333????333??_#?????? ?333????333??b#??????(?333??'?333??a#???????333??>?333?? >>>>> >>>>> >>>>> >>>>> >>>>> If I understand your answer, to solve my problem, should just upgrade all my software ? >>>>> >>>>> Thanks, >>>>> Sami, >>>>> >>>>> >>>>> -- >>>>> Dr. Sami BEN ELHAJ SALAH >>>>> Ing?nieur de Recherche (CNRS) >>>>> Institut Pprime - ISAE - ENSMA >>>>> Mobile: 06.62.51.26.74 >>>>> Email: sami.ben-elhaj-salah at ensma.fr >>>>> www.samibenelhajsalah.com > >>>>> >>>>> >>>>> >>>>>> Le 8 juin 2022 ? 15:37, Jed Brown > a ?crit : >>>>>> >>>>>> You're using pretty old versions of all software; I'd recommend upgrading. I recommend choosing the file name "solution.vtu" to use the modern (non-legacy) format. Does that work for you? >>>>>> >>>>>> Sami BEN ELHAJ SALAH > writes: >>>>>> >>>>>>> Dear Petsc Developer team, >>>>>>> >>>>>>> I solved a linear elastic problem in 3D using a DMPLEX. My system is converging, then I would like to write out my solution vector to a vtk file where I use unstructured mesh. Currently, I tried two algorithms and I have the same result. >>>>>>> >>>>>>> 1) Algorithm 1 >>>>>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>>>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>>>>> PetscViewer vtk; >>>>>>> PetscViewerVTKOpen(FOX::Parallel::COMM_WORLD,"solution.vtk",FILE_MODE_WRITE,&vtk); >>>>>>> VecView(solution,vtk); >>>>>>> PetscViewerDestroy(&vtk); >>>>>>> >>>>>>> >>>>>>> 2) Algorithm 2 >>>>>>> err = SNESSolve(_snes, bc_vec_test, solution); >>>>>>> CHKERRABORT(FOX::Parallel::COMM_WORLD,err); >>>>>>> PetscViewer vtk; >>>>>>> PetscViewerCreate(FOX::Parallel::COMM_WORLD, &vtk); >>>>>>> PetscViewerSetType(vtk, PETSCVIEWERVTK); >>>>>>> PetscViewerFileSetName(vtk, "sol.vtk"); >>>>>>> VecView(solution, vtk); >>>>>>> PetscViewerDestroy(&vtk); >>>>>>> >>>>>>> The result seems correct except for the rotation order of the nodes (see the red lines on gmsh and vtk file respectively). Then, I visualized my vtk file with paraview, and I remarked that my geometry is not correct and not conserved when comparing it with my gmsh file. So, I didn?t understand why the rotation order of nodes is not conserved when saving my result to a vtk file? 
>>>>>>> >>>>>>> Other information used: >>>>>>> - gmsh format 2.2 >>>>>>> - Vtk version: 7.1.1 >>>>>>> - Petsc version: 3.13/opt >>>>>>> >>>>>>> Below my two files gmsh and vtk: >>>>>>> >>>>>>> Gmsh file: >>>>>>> $MeshFormat >>>>>>> 2.2 0 8 >>>>>>> $EndMeshFormat >>>>>>> $Nodes >>>>>>> 12 >>>>>>> 1 0.0 10.0 10.0 >>>>>>> 2 0.0 0.0 10.0 >>>>>>> 3 0.0 0.0 0.0 >>>>>>> 4 0.0 10.0 0.0 >>>>>>> 5 10.0 10.0 10.0 >>>>>>> 6 10.0 0.0 10.0 >>>>>>> 7 10.0 0.0 0.0 >>>>>>> 8 10.0 10.0 0.0 >>>>>>> 9 20.0 10.0 10.0 >>>>>>> 10 20.0 0.0 10.0 >>>>>>> 11 20.0 0.0 0.0 >>>>>>> 12 20.0 10.0 0.0 >>>>>>> $EndNodes >>>>>>> $Elements >>>>>>> 2 >>>>>>> 1 5 2 68 60 1 2 3 4 5 6 7 8 >>>>>>> 2 5 2 68 60 5 6 7 8 9 10 11 12 >>>>>>> $EndElements >>>>>>> >>>>>>> Vtk file : >>>>>>> # vtk DataFile Version 2.0 >>>>>>> Simplicial Mesh Example >>>>>>> ASCII >>>>>>> DATASET UNSTRUCTURED_GRID >>>>>>> POINTS 12 double >>>>>>> 0.000000e+00 1.000000e+01 1.000000e+01 >>>>>>> 0.000000e+00 0.000000e+00 1.000000e+01 >>>>>>> 0.000000e+00 0.000000e+00 0.000000e+00 >>>>>>> 0.000000e+00 1.000000e+01 0.000000e+00 >>>>>>> 1.000000e+01 1.000000e+01 1.000000e+01 >>>>>>> 1.000000e+01 0.000000e+00 1.000000e+01 >>>>>>> 1.000000e+01 0.000000e+00 0.000000e+00 >>>>>>> 1.000000e+01 1.000000e+01 0.000000e+00 >>>>>>> 2.000000e+01 1.000000e+01 1.000000e+01 >>>>>>> 2.000000e+01 0.000000e+00 1.000000e+01 >>>>>>> 2.000000e+01 0.000000e+00 0.000000e+00 >>>>>>> 2.000000e+01 1.000000e+01 0.000000e+00 >>>>>>> CELLS 2 18 >>>>>>> 8 0 3 2 1 4 5 6 7 >>>>>>> 8 4 7 6 5 8 9 10 11 >>>>>>> CELL_TYPES 2 >>>>>>> 12 >>>>>>> 12 >>>>>>> POINT_DATA 12 >>>>>>> VECTORS dU_x double >>>>>>> 2.754808e-10 -8.653846e-11 -8.653846e-11 >>>>>>> 2.754808e-10 8.653846e-11 -8.653846e-11 >>>>>>> 2.754808e-10 8.653846e-11 8.653846e-11 >>>>>>> 2.754808e-10 -8.653846e-11 8.653846e-11 >>>>>>> 4.678571e-01 -9.107143e-02 -9.107143e-02 >>>>>>> 4.678571e-01 9.107143e-02 -9.107143e-02 >>>>>>> 4.678571e-01 9.107143e-02 9.107143e-02 >>>>>>> 4.678571e-01 -9.107143e-02 9.107143e-02 >>>>>>> 1.000000e+00 -7.500000e-02 -7.500000e-02 >>>>>>> 1.000000e+00 7.500000e-02 -7.500000e-02 >>>>>>> 1.000000e+00 7.500000e-02 7.500000e-02 >>>>>>> 1.000000e+00 -7.500000e-02 7.500000e-02 >>>>>>> >>>>>>> Thank you in advance and have a good day ! >>>>>>> >>>>>>> Sami, >>>>>>> >>>>>>> -- >>>>>>> Dr. Sami BEN ELHAJ SALAH >>>>>>> Ing?nieur de Recherche (CNRS) >>>>>>> Institut Pprime - ISAE - ENSMA >>>>>>> Mobile: 06.62.51.26.74 >>>>>>> Email: sami.ben-elhaj-salah at ensma.fr >>>>>>> www.samibenelhajsalah.com > >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ From hongzhang at anl.gov Mon Jun 13 15:42:13 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Mon, 13 Jun 2022 20:42:13 +0000 Subject: [petsc-users] Calling Pytorch and Python within PETSc C/C++ code In-Reply-To: References: Message-ID: No. It is not common to execute Python scripts or libraries from C/C++ code. If you are looking for ways to use PETSc and PyTorch together, it is best to build your application in Python so that you can use both petsc4py and PyTorch. 
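If the C/C++ code has to stay in charge and you only need a few numbers back from a trained network, the plain CPython embedding API is always an option, but that is outside PETSc and you are responsible for the build flags and error handling yourself. A rough sketch follows; the module name "mymodel" and the function "predict" are placeholders for whatever Python wrapper you write around your PyTorch model, and predict() is assumed to return a plain Python float (error checks omitted):

  #include <Python.h>   /* must come before other headers */
  #include <petscsys.h>

  static PetscErrorCode EvalModel(double x, double *y)
  {
    PyObject *mod, *fn, *arg, *res;

    PetscFunctionBeginUser;
    mod = PyImport_ImportModule("mymodel");       /* import mymodel        */
    fn  = PyObject_GetAttrString(mod, "predict"); /* fn = mymodel.predict  */
    arg = Py_BuildValue("(d)", x);                /* argument tuple (x,)   */
    res = PyObject_CallObject(fn, arg);           /* res = predict(x)      */
    *y  = PyFloat_AsDouble(res);
    Py_XDECREF(res); Py_XDECREF(arg); Py_XDECREF(fn); Py_XDECREF(mod);
    PetscFunctionReturn(0);
  }

  /* call Py_Initialize() once after PetscInitialize(), and Py_Finalize() before PetscFinalize() */

That said, the Python-driven route is still the cleaner way to combine PETSc with PyTorch.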
See the following code for an example: https://github.com/caidao22/pnode Hong (Mr.) On Jun 13, 2022, at 9:50 AM, Amneet Bhalla > wrote: Hi Guys, Is there a PETSc interface to make calls to Python scripts or libraries (e.g., Pytorch) from a C/C++ code making use of PETSc? If so, are there some examples that I refer to? Thanks, -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Jun 13 16:11:28 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 13 Jun 2022 17:11:28 -0400 Subject: [petsc-users] Calling Pytorch and Python within PETSc C/C++ code In-Reply-To: References: Message-ID: <9AFC3AE5-E969-4755-911F-E7731C210E98@petsc.dev> Note that your Python main ptsc4py program can call C/C++ code for some of its computations, so if you have a lot of C/C++ code you do not need to change it all to Python. It is also possible to call Petsc4py (and hence PyTorch) from a C/C++ main but a bit more cumbersome so not recommended. See src/ksp/ksp/tutorials/ex100.c and ex100.py for an example of a C/C++ main that uses petsc4py (in a limited way). > On Jun 13, 2022, at 4:42 PM, Zhang, Hong via petsc-users wrote: > > No. It is not common to execute Python scripts or libraries from C/C++ code. If you are looking for ways to use PETSc and PyTorch together, it is best to build your application in Python so that you can use both petsc4py and PyTorch. See the following code for an example: > https://github.com/caidao22/pnode > > Hong (Mr.) > >> On Jun 13, 2022, at 9:50 AM, Amneet Bhalla > wrote: >> >> >> Hi Guys, >> >> Is there a PETSc interface to make calls to Python scripts or libraries (e.g., Pytorch) from a C/C++ code making use of PETSc? If so, are there some examples that I refer to? >> >> Thanks, >> -- >> --Amneet >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Mon Jun 13 21:46:14 2022 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Mon, 13 Jun 2022 19:46:14 -0700 Subject: [petsc-users] Calling Pytorch and Python within PETSc C/C++ code In-Reply-To: <9AFC3AE5-E969-4755-911F-E7731C210E98@petsc.dev> References: <9AFC3AE5-E969-4755-911F-E7731C210E98@petsc.dev> Message-ID: Thanks for the information. We will check these examples out. Basically we have some trained ANNs that will provide few scalars to the C++ based CFD code. We won?t envision doing too much data transfer between C++ and Python/PyTorch. On Mon, Jun 13, 2022 at 2:11 PM Barry Smith wrote: > > Note that your Python main ptsc4py program can call C/C++ code for some > of its computations, so if you have a lot of C/C++ code you do not need to > change it all to Python. It is also possible to call Petsc4py (and hence > PyTorch) from a C/C++ main but a bit more cumbersome so not recommended. > See src/ksp/ksp/tutorials/ex100.c and ex100.py for an example of a C/C++ > main that uses petsc4py (in a limited way). > > > On Jun 13, 2022, at 4:42 PM, Zhang, Hong via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > No. It is not common to execute Python scripts or libraries from C/C++ > code. If you are looking for ways to use PETSc and PyTorch together, it is > best to build your application in Python so that you can use both petsc4py > and PyTorch. See the following code for an example: > https://github.com/caidao22/pnode > > Hong (Mr.) 
> > On Jun 13, 2022, at 9:50 AM, Amneet Bhalla wrote: > > > Hi Guys, > > Is there a PETSc interface to make calls to Python scripts or libraries > (e.g., Pytorch) from a C/C++ code making use of PETSc? If so, are there > some examples that I refer to? > > Thanks, > -- > --Amneet > > > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsfaraway at gmail.com Wed Jun 15 01:56:00 2022 From: jsfaraway at gmail.com (Runfeng Jin) Date: Wed, 15 Jun 2022 14:56:00 +0800 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> Message-ID: Hi! You are right! I try to use a SLEPc and PETSc version with nodebug, and the matrix B's solver time become 99s. But It is still a little higher than matrix A(8s). Same as mentioned before, attachment is log view of no-debug version: file 1: log of matrix A solver. This is a larger matrix(900,000*900,000) but solved quickly(8s); file 2: log of matix B solver. This is a smaller matrix(2,547*2,547) but solved much slower(99s). By comparing these two files, the strang phenomenon still exist: 1) Matrix A has more basis vectors(375) than B(189), but A spent less time on BVCreate(0.6s) than B(32s); 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) 3) In debug version, matrix B distribute much more unbalancedly storage among processors(memory max/min 4365) than A(memory max/min 1.113), but other metrics seems more balanced. And in no-debug version there is no memory information output. The significant difference I can tell is :1) B use preallocation; 2) A's matrix elements are calculated by CPU, while B's matrix elements are calculated by GPU and then transfered to CPU and solved by PETSc in CPU. Does this is a normal result? I mean, the matrix with less non-zero elements and less dimension can cost more epssolve time? Is this due to the structure of matrix? IF so, is there any ways to increase the solve speed? Or this is weired and should be fixed by some ways? Thank you! Runfeng Jin Jose E. Roman ?2022?6?12??? 16:08??? > Please always respond to the list. > > Pay attention to the warnings in the log: > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option. # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > With the debugging option the times are not trustworthy, so I suggest > repeating the analysis with an optimized build. > > Jose > > > > El 12 jun 2022, a las 5:41, Runfeng Jin escribi?: > > > > Hello! > > I compare these two matrix solver's log view and find some strange > thing. Attachment files are the log view.: > > file 1: log of matrix A solver. This is a larger > matrix(900,000*900,000) but solved quickly(30s); > > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547 , > a little different from the matrix B that is mentioned in initial email, > but solved much slower too. I use this for a quicker test) but solved much > slower(1244s). 
> > > > By comparing these two files, I find some thing: > > 1) Matrix A has more basis vectors(375) than B(189), but A spent less > time on BVCreate(0.349s) than B(296s); > > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) > > 3) Matrix B distribute much more unbalancedly storage among > processors(memory max/min 4365) than A(memory max/min 1.113), but other > metrics seems more balanced. > > > > I don't do prealocation in A, and it is distributed across processors by > PETSc. For B , when preallocation I use PetscSplitOwnership to decide which > part belongs to local processor, and B is also distributed by PETSc when > compute matrix values. > > > > - Does this mean, for matrix B, too much nonzero elements are stored in > single process, and this is why it cost too much more time in solving the > matrix and find eigenvalues? If so, are there some better ways to > distribute the matrix among processors? > > - Or are there any else reasons for this difference in cost time? > > > > Hope to recieve your reply, thank you! > > > > Runfeng Jin > > > > > > > > Runfeng Jin ?2022?6?11??? 20:33??? > > Hello! > > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much time. > Is there anything else I can do? Attachment is log when use PETSC_DEFAULT > for eps_ncv. > > > > Thank you ! > > > > Runfeng Jin > > > > Jose E. Roman ?2022?6?10??? 20:50??? > > The value -eps_ncv 5000 is huge. > > Better let SLEPc use the default value. > > > > Jose > > > > > > > El 10 jun 2022, a las 14:24, Jin Runfeng > escribi?: > > > > > > Hello! > > > I want to acquire the 3 smallest eigenvalue, and attachment is the > log view output. I can see epssolve really cost the major time. But I can > not see why it cost so much time. Can you see something from it? > > > > > > Thank you ! > > > > > > Runfeng Jin > > > > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman wrote: > > > Convergence depends on distribution of eigenvalues you want to > compute. On the other hand, the cost also depends on the time it takes to > build the preconditioner. Use -log_view to see the cost of the different > steps of the computation. > > > > > > Jose > > > > > > > > > > El 3 jun 2022, a las 18:50, jsfaraway > escribi?: > > > > > > > > hello! > > > > > > > > I am trying to use epsgd compute matrix's one smallest eigenvalue. > And I find a strang thing. There are two matrix A(900000*900000) and > B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B > use 22 iterations and 38885s! What could be the reason for this? Or what > can I do to find the reason? > > > > > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". > > > > And there is one difference I can tell is matrix B has many small > value, whose absolute value is less than 10-6. Could this be the reason? > > > > > > > > Thank you! > > > > > > > > Runfeng Jin > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsfaraway at gmail.com Wed Jun 15 01:58:32 2022 From: jsfaraway at gmail.com (Runfeng Jin) Date: Wed, 15 Jun 2022 14:58:32 +0800 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> Message-ID: Sorry ,I miss the attachment. Runfeng Jin Runfeng Jin ?2022?6?15??? 14:56??? > Hi! You are right! I try to use a SLEPc and PETSc version with nodebug, > and the matrix B's solver time become 99s. 
But It is still a little higher > than matrix A(8s). Same as mentioned before, attachment is log view of > no-debug version: > file 1: log of matrix A solver. This is a larger > matrix(900,000*900,000) but solved quickly(8s); > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547) > but solved much slower(99s). > > By comparing these two files, the strang phenomenon still exist: > 1) Matrix A has more basis vectors(375) than B(189), but A spent less time > on BVCreate(0.6s) than B(32s); > 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) > 3) In debug version, matrix B distribute much more unbalancedly storage > among processors(memory max/min 4365) than A(memory max/min 1.113), but > other metrics seems more balanced. And in no-debug version there is no > memory information output. > > The significant difference I can tell is :1) B use preallocation; 2) A's > matrix elements are calculated by CPU, while B's matrix elements are > calculated by GPU and then transfered to CPU and solved by PETSc in CPU. > > Does this is a normal result? I mean, the matrix with less non-zero > elements and less dimension can cost more epssolve time? Is this due to the > structure of matrix? IF so, is there any ways to increase the solve speed? > > Or this is weired and should be fixed by some ways? > Thank you! > > Runfeng Jin > > > Jose E. Roman ?2022?6?12??? 16:08??? > >> Please always respond to the list. >> >> Pay attention to the warnings in the log: >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was compiled with a debugging option. # >> # To get timing results run ./configure # >> # using --with-debugging=no, the performance will # >> # be generally two or three times faster. # >> # # >> ########################################################## >> >> With the debugging option the times are not trustworthy, so I suggest >> repeating the analysis with an optimized build. >> >> Jose >> >> >> > El 12 jun 2022, a las 5:41, Runfeng Jin escribi?: >> > >> > Hello! >> > I compare these two matrix solver's log view and find some strange >> thing. Attachment files are the log view.: >> > file 1: log of matrix A solver. This is a larger >> matrix(900,000*900,000) but solved quickly(30s); >> > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547 >> , a little different from the matrix B that is mentioned in initial email, >> but solved much slower too. I use this for a quicker test) but solved much >> slower(1244s). >> > >> > By comparing these two files, I find some thing: >> > 1) Matrix A has more basis vectors(375) than B(189), but A spent less >> time on BVCreate(0.349s) than B(296s); >> > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) >> > 3) Matrix B distribute much more unbalancedly storage among >> processors(memory max/min 4365) than A(memory max/min 1.113), but other >> metrics seems more balanced. >> > >> > I don't do prealocation in A, and it is distributed across processors >> by PETSc. For B , when preallocation I use PetscSplitOwnership to decide >> which part belongs to local processor, and B is also distributed by PETSc >> when compute matrix values. >> > >> > - Does this mean, for matrix B, too much nonzero elements are stored in >> single process, and this is why it cost too much more time in solving the >> matrix and find eigenvalues? If so, are there some better ways to >> distribute the matrix among processors? 
>> > - Or are there any else reasons for this difference in cost time? >> > >> > Hope to recieve your reply, thank you! >> > >> > Runfeng Jin >> > >> > >> > >> > Runfeng Jin ?2022?6?11??? 20:33??? >> > Hello! >> > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much time. >> Is there anything else I can do? Attachment is log when use PETSC_DEFAULT >> for eps_ncv. >> > >> > Thank you ! >> > >> > Runfeng Jin >> > >> > Jose E. Roman ?2022?6?10??? 20:50??? >> > The value -eps_ncv 5000 is huge. >> > Better let SLEPc use the default value. >> > >> > Jose >> > >> > >> > > El 10 jun 2022, a las 14:24, Jin Runfeng >> escribi?: >> > > >> > > Hello! >> > > I want to acquire the 3 smallest eigenvalue, and attachment is the >> log view output. I can see epssolve really cost the major time. But I can >> not see why it cost so much time. Can you see something from it? >> > > >> > > Thank you ! >> > > >> > > Runfeng Jin >> > > >> > > On 6? 4 2022, at 1:37 ??, Jose E. Roman wrote: >> > > Convergence depends on distribution of eigenvalues you want to >> compute. On the other hand, the cost also depends on the time it takes to >> build the preconditioner. Use -log_view to see the cost of the different >> steps of the computation. >> > > >> > > Jose >> > > >> > > >> > > > El 3 jun 2022, a las 18:50, jsfaraway >> escribi?: >> > > > >> > > > hello! >> > > > >> > > > I am trying to use epsgd compute matrix's one smallest eigenvalue. >> And I find a strang thing. There are two matrix A(900000*900000) and >> B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B >> use 22 iterations and 38885s! What could be the reason for this? Or what >> can I do to find the reason? >> > > > >> > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". >> > > > And there is one difference I can tell is matrix B has many small >> value, whose absolute value is less than 10-6. Could this be the reason? >> > > > >> > > > Thank you! >> > > > >> > > > Runfeng Jin >> > > >> > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /public/home/jrf/works/ecMRCI-shaula/MRCI on a named g16r3n07 with 256 processors, by jrf Wed Jun 15 10:04:00 2022 Using Petsc Release Version 3.15.1, Jun 17, 2021 Max Max/Min Avg Total Time (sec): 1.029e+02 1.001 1.028e+02 Objects: 2.011e+03 1.146 1.761e+03 Flop: 1.574e+06 2.099 1.104e+06 2.827e+08 Flop/sec: 1.531e+04 2.099 1.074e+04 2.748e+06 MPI Messages: 3.881e+04 7.920 1.865e+04 4.773e+06 MPI Message Lengths: 1.454e+06 6.190 3.542e+01 1.691e+08 MPI Reductions: 1.791e+03 1.001 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.0285e+02 100.0% 2.8266e+08 100.0% 4.773e+06 100.0% 3.542e+01 100.0% 1.769e+03 98.9% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 2 1.0 4.0572e-01 2.6 0.00e+00 0.0 3.7e+04 4.0e+00 2.0e+00 0 0 1 0 0 0 0 1 0 0 0 BuildTwoSidedF 1 1.0 2.0986e-01 2.6 0.00e+00 0.0 2.4e+04 1.1e+02 1.0e+00 0 0 1 2 0 0 0 1 2 0 0 MatMult 193 1.0 4.6531e+00 1.1 9.85e+05 4.3 4.7e+06 3.5e+01 1.0e+00 4 48 99 98 0 4 48 99 98 0 29 MatSolve 377 1.0 1.6183e-0288.8 7.16e+04 6.6 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 982 MatLUFactorNum 1 1.0 5.7322e-05 2.8 6.21e+0222.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2177 MatILUFactorSym 1 1.0 8.5668e-03793.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 1 1.0 2.1006e-01 2.6 0.00e+00 0.0 2.4e+04 1.1e+02 1.0e+00 0 0 1 2 0 0 0 1 2 0 0 MatAssemblyEnd 1 1.0 3.0272e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 7.0000e-07 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 5.3758e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 98 1.0 9.0806e-0371.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNorm 3 1.0 1.8633e-01 1.8 6.00e+01 1.1 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecCopy 959 1.0 1.4909e-0275.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 387 1.0 9.8578e-04 8.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 3 1.0 1.6157e-023639.0 6.00e+01 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1 VecScatterBegin 196 1.0 3.7129e-01 1.5 0.00e+00 0.0 4.7e+06 3.5e+01 4.0e+00 0 0 99 98 0 0 0 99 98 0 0 VecScatterEnd 196 1.0 4.6171e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 VecSetRandom 3 1.0 3.5271e-0515.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 634 1.0 1.7589e-0265.9 1.20e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 174 VecReduceComm 444 1.0 2.4972e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.4e+02 24 0 0 0 25 24 0 0 0 25 0 SFSetGraph 1 1.0 1.1170e-05 9.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 4 1.0 2.3891e-01 1.3 0.00e+00 0.0 4.9e+04 1.1e+01 1.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFPack 196 1.0 1.2023e-0291.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 196 1.0 4.3491e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 EPSSetUp 1 1.0 9.6815e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.7e+01 1 0 0 0 1 1 0 0 0 1 0 EPSSolve 1 1.0 9.9906e+01 1.0 1.56e+06 2.1 4.7e+06 3.5e+01 1.7e+03 97 99 98 97 97 97 99 98 97 99 3 STSetUp 1 1.0 2.8679e-0450.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 STComputeOperatr 1 1.0 2.0985e-04223.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVCreate 194 1.0 3.2437e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.8e+02 31 0 0 0 33 31 0 0 0 33 0 BVCopy 386 1.0 1.7107e-02110.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMultVec 1090 1.0 1.8337e-0221.1 2.22e+05 1.1 
0.0e+00 0.0e+00 0.0e+00 0 20 0 0 0 0 20 0 0 0 3080 BVMultInPlace 224 1.0 1.8273e-0218.4 1.06e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 1481 BVDot 319 1.0 1.7687e+01 1.1 1.11e+05 1.1 0.0e+00 0.0e+00 3.2e+02 17 10 0 0 18 17 10 0 0 18 2 BVDotVec 392 1.0 2.2083e+01 1.0 6.32e+04 1.1 0.0e+00 0.0e+00 3.9e+02 21 6 0 0 22 21 6 0 0 22 1 BVOrthogonalizeV 190 1.0 1.1538e+01 1.0 1.15e+05 1.1 0.0e+00 0.0e+00 2.0e+02 11 10 0 0 11 11 10 0 0 12 3 BVScale 254 1.0 1.7301e-02125.1 2.54e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 37 BVSetRandom 3 1.0 3.6330e-0485.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMatProject 255 1.0 1.7707e+01 1.1 1.11e+05 1.1 0.0e+00 0.0e+00 3.2e+02 17 10 0 0 18 17 10 0 0 18 2 DSSolve 82 1.0 5.4953e-0215.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSVectors 380 1.0 9.7683e-0366.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSOther 179 1.0 1.7321e-0239.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 1 1.0 1.8680e-0574.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 377 1.0 1.8723e-0214.2 7.16e+04 6.6 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 849 PCSetUp 2 1.0 8.8937e-0353.4 6.21e+0222.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14 PCApply 377 1.0 3.0085e-0213.8 7.23e+04 6.6 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 532 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 745 745 2664464 0. Vector 793 793 1538360 0. Index Set 10 10 10792 0. Star Forest Graph 4 4 5376 0. EPS Solver 1 1 3468 0. Spectral Transform 1 1 908 0. Basis Vectors 195 195 437744 0. Region 1 1 680 0. Direct Solver 1 1 20156 0. Krylov Solver 2 2 3200 0. Preconditioner 2 2 1936 0. PetscRandom 1 1 670 0. Viewer 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 4.7e-08 Average time for MPI_Barrier(): 0.0578456 Average time for zero size MPI_Send(): 0.00358668 #PETSc Option Table entries: -eps_gd_blocksize 3 -eps_gd_initial_size 3 -eps_ncv PETSC_DEFAULT -eps_type gd -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-blaslapack=1 --with-blaslapack-dir=/public/software/compiler/intel/oneapi/mkl/2021.3.0 --with-64-bit-blas-indices=0 --with-boost=1 --with-boost-dir=/public/home/jrf/tools/boost_1_73_0/gcc7.3.1 --prefix=/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug --with-valgrind-dir=/public/home/jrf/tools/valgrind --LDFLAGS=-Wl,-rpath=/opt/rh/devtoolset-7/root/usr/lib64 -Wl,-rpath=/opt/rh/devtoolset-7/root/usr/lib --with-64-bit-indices=0 --with-petsc-arch=gcc7.3.1-32indices-nodebug --with-debugging=no ----------------------------------------- Libraries compiled on 2022-06-14 01:43:59 on login05 Machine characteristics: Linux-3.10.0-957.el7.x86_64-x86_64-with-centos Using PETSc directory: /public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug Using PETSc arch: ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O ----------------------------------------- Using include paths: -I/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug/include -I/public/home/jrf/tools/boost_1_73_0/gcc7.3.1/include -I/public/home/jrf/tools/valgrind/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug/lib -L/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug/lib -lpetsc -Wl,-rpath,/public/software/compiler/intel/oneapi/mkl/2021.3.0/lib/intel64 -L/public/software/compiler/intel/oneapi/mkl/2021.3.0/lib/intel64 -Wl,-rpath,/opt/hpc/software/mpi/hwloc/lib -L/opt/hpc/software/mpi/hwloc/lib -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/gcc-7.3.1/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/gcc-7.3.1/lib -Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7 -L/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7 -Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib64 -L/opt/rh/devtoolset-7/root/usr/lib64 -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/sharp/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/sharp/lib -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/hcoll/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/hcoll/lib -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/ucx_without_rocm/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/ucx_without_rocm/lib -Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib -L/opt/rh/devtoolset-7/root/usr/lib -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl ----------------------------------------- -------------- next part -------------- 
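The "Average time for MPI_Barrier()" and "Average time for zero size MPI_Send()" lines reported at the end of these logs can also be checked independently of PETSc. A rough mpi4py sketch of such a micro-benchmark follows; this is not the code PETSc itself uses, it only gives a ballpark figure for the same two quantities.

# barrier_send_check.py - rough sketch, assumes mpi4py is installed
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
n = 100                      # repetitions to average over

# Average MPI_Barrier() time
comm.Barrier()
t0 = MPI.Wtime()
for _ in range(n):
    comm.Barrier()
t_barrier = (MPI.Wtime() - t0) / n

# Average zero-size MPI_Send() time between ranks 0 and 1
t_send = 0.0
buf = bytearray(0)
if size > 1:
    if rank == 0:
        t0 = MPI.Wtime()
        for _ in range(n):
            comm.Send([buf, MPI.BYTE], dest=1, tag=7)
        t_send = (MPI.Wtime() - t0) / n
    elif rank == 1:
        for _ in range(n):
            comm.Recv([buf, MPI.BYTE], source=0, tag=7)

if rank == 0:
    print("avg MPI_Barrier: %g s, avg zero-size MPI_Send: %g s" % (t_barrier, t_send))

For comparison, the log above reports about 5.8e-2 s per barrier, while the log below reports about 1.9e-5 s; that gap is what the discussion further down focuses on.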
************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /public/home/jrf/works/qubic/bin/pfci.x on a named h09r4n13 with 192 processors, by jrf Wed Jun 15 12:10:57 2022 Using Petsc Release Version 3.15.1, Jun 17, 2021 Max Max/Min Avg Total Time (sec): 9.703e+02 1.000 9.703e+02 Objects: 2.472e+03 1.000 2.472e+03 Flop: 6.278e+09 1.064 6.012e+09 1.154e+12 Flop/sec: 6.470e+06 1.064 6.196e+06 1.190e+09 MPI Messages: 3.635e+04 1.947 2.755e+04 5.290e+06 MPI Message Lengths: 7.246e+08 1.742 2.052e+04 1.085e+11 MPI Reductions: 2.464e+03 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 9.7032e+02 100.0% 1.1543e+12 100.0% 5.290e+06 100.0% 2.052e+04 100.0% 2.446e+03 99.3% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 2 1.0 1.9883e+029876.1 0.00e+00 0.0 2.1e+04 4.0e+00 2.0e+00 11 0 0 0 0 11 0 0 0 0 0 BuildTwoSidedF 1 1.0 1.9879e+021349804.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 11 0 0 0 0 11 0 0 0 0 0 MatMult 247 1.0 2.5963e+00 1.6 1.16e+09 1.2 5.3e+06 2.1e+04 1.0e+00 0 17100100 0 0 17100100 0 77449 MatSolve 479 1.0 3.2541e-01 2.3 3.89e+08 2.2 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 146312 MatLUFactorNum 1 1.0 4.3923e-02 7.0 2.24e+07 4.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 41413 MatILUFactorSym 1 1.0 2.5215e-03 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 1 1.0 1.9879e+02654719.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 11 0 0 0 0 11 0 0 0 0 0 MatAssemblyEnd 1 1.0 2.1247e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 8.3000e-07 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.0741e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 244 1.0 2.5375e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNorm 3 1.0 1.3600e-0125.4 2.83e+04 1.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 40 VecCopy 1214 1.0 6.8012e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 486 1.0 2.4261e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 3 1.0 1.5987e-04 3.8 2.83e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 34009 VecScatterBegin 247 1.0 3.8039e-01 2.2 0.00e+00 0.0 5.3e+06 2.1e+04 1.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 247 1.0 1.3181e+00 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 6 1.0 1.3014e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 723 1.0 5.9514e-03 2.1 6.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 220153 VecReduceComm 482 1.0 2.1629e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.8e+02 0 0 0 0 20 0 0 0 0 20 0 SFSetGraph 1 1.0 1.3207e-03 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 1 1.0 8.3540e-02 1.4 0.00e+00 0.0 4.2e+04 5.2e+03 1.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFPack 247 1.0 2.3981e-01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 247 1.0 1.8351e-04 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 EPSSetUp 1 1.0 1.5565e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.7e+01 0 0 0 0 1 0 0 0 0 1 0 EPSSolve 1 1.0 8.5090e+00 1.0 6.26e+09 1.1 5.2e+06 2.1e+04 2.4e+03 1100 99 99 99 1100 99 99 99 135365 STSetUp 1 1.0 1.3724e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 STComputeOperatr 1 1.0 7.1348e-05 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVCreate 245 1.0 6.2414e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 7.4e+02 0 0 0 0 30 0 0 0 0 30 0 BVCopy 488 1.0 1.9780e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMultVec 1210 1.0 
6.7882e-01 1.1 1.14e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 19 0 0 0 0 19 0 0 0 321786 BVMultInPlace 247 1.0 7.8465e-01 1.6 2.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 45 0 0 0 0 45 0 0 0 663459 BVDot 718 1.0 1.7888e+00 2.0 5.64e+08 1.0 0.0e+00 0.0e+00 7.2e+02 0 9 0 0 29 0 9 0 0 29 60566 BVDotVec 487 1.0 5.3124e-01 1.2 2.85e+08 1.0 0.0e+00 0.0e+00 4.9e+02 0 5 0 0 20 0 5 0 0 20 102853 BVOrthogonalizeV 244 1.0 5.6093e-01 1.0 5.62e+08 1.0 0.0e+00 0.0e+00 2.5e+02 0 9 0 0 10 0 9 0 0 10 192477 BVScale 482 1.0 2.0062e-03 1.7 2.28e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 217721 BVSetRandom 6 1.0 1.3480e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMatProject 480 1.0 1.8300e+00 2.0 5.64e+08 1.0 0.0e+00 0.0e+00 7.2e+02 0 9 0 0 29 0 9 0 0 29 59203 DSSolve 242 1.0 2.3012e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSVectors 482 1.0 4.5230e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSOther 485 1.0 2.4384e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 1 1.0 3.5111e-05 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 479 1.0 3.3062e-01 2.2 3.89e+08 2.2 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 144006 PCSetUp 2 1.0 4.6721e-02 6.0 2.24e+07 4.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 38932 PCApply 479 1.0 3.7933e-01 2.4 4.11e+08 2.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 130309 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 1216 1216 91546008 0. Vector 994 994 83668472 0. Index Set 5 5 733636 0. Star Forest Graph 1 1 1224 0. EPS Solver 1 1 13512 0. Spectral Transform 1 1 908 0. Basis Vectors 246 246 785872 0. Region 1 1 680 0. Direct Solver 1 1 3617024 0. Krylov Solver 2 2 3200 0. Preconditioner 2 2 1936 0. PetscRandom 1 1 670 0. Viewer 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 5e-08 Average time for MPI_Barrier(): 1.90986e-05 Average time for zero size MPI_Send(): 3.44587e-06 #PETSc Option Table entries: -eps_ncv 300 -eps_nev 3 -eps_smallest_real -eps_tol 1e-10 -eps_type gd -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-blaslapack=1 --with-blaslapack-dir=/public/software/compiler/intel/oneapi/mkl/2021.3.0 --with-64-bit-blas-indices=0 --with-boost=1 --with-boost-dir=/public/home/jrf/tools/boost_1_73_0/gcc7.3.1 --prefix=/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug --with-valgrind-dir=/public/home/jrf/tools/valgrind --LDFLAGS=-Wl,-rpath=/opt/rh/devtoolset-7/root/usr/lib64 -Wl,-rpath=/opt/rh/devtoolset-7/root/usr/lib --with-64-bit-indices=0 --with-petsc-arch=gcc7.3.1-32indices-nodebug --with-debugging=no ----------------------------------------- Libraries compiled on 2022-06-14 01:43:59 on login05 Machine characteristics: Linux-3.10.0-957.el7.x86_64-x86_64-with-centos Using PETSc directory: /public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug Using PETSc arch: ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O ----------------------------------------- Using include paths: -I/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug/include -I/public/home/jrf/tools/boost_1_73_0/gcc7.3.1/include -I/public/home/jrf/tools/valgrind/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug/lib -L/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices-nodebug/lib -lpetsc -Wl,-rpath,/public/software/compiler/intel/oneapi/mkl/2021.3.0/lib/intel64 -L/public/software/compiler/intel/oneapi/mkl/2021.3.0/lib/intel64 -Wl,-rpath,/opt/hpc/software/mpi/hwloc/lib -L/opt/hpc/software/mpi/hwloc/lib -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/gcc-7.3.1/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/gcc-7.3.1/lib -Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7 -L/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7 -Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib64 -L/opt/rh/devtoolset-7/root/usr/lib64 -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/sharp/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/sharp/lib -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/hcoll/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/hcoll/lib -Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/ucx_without_rocm/lib -L/opt/hpc/software/mpi/hpcx/v2.7.4/ucx_without_rocm/lib -Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib -L/opt/rh/devtoolset-7/root/usr/lib -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl ----------------------------------------- From jroman at dsic.upv.es Wed Jun 15 03:09:01 2022 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Wed, 15 Jun 2022 10:09:01 +0200 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> Message-ID: <14570A42-226E-49BB-9074-850258860ACC@dsic.upv.es> You are comparing two different codes on two different machines? Or is it the same machine? with different number of processes and different solver options... If it is the same machine, the performance seems very different: Matrix A: Average time for MPI_Barrier(): 1.90986e-05 Average time for zero size MPI_Send(): 3.44587e-06 Matrix B: Average time for MPI_Barrier(): 0.0578456 Average time for zero size MPI_Send(): 0.00358668 The reductions (VecReduceComm) are taking 2.1629e-01 and 2.4972e+01, respectively. It's a two orders of magnitude difference. Jose > El 15 jun 2022, a las 8:58, Runfeng Jin escribi?: > > Sorry ,I miss the attachment. > > Runfeng Jin > > Runfeng Jin ?2022?6?15??? 14:56??? > Hi! You are right! I try to use a SLEPc and PETSc version with nodebug, and the matrix B's solver time become 99s. But It is still a little higher than matrix A(8s). Same as mentioned before, attachment is log view of no-debug version: > file 1: log of matrix A solver. This is a larger matrix(900,000*900,000) but solved quickly(8s); > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547) but solved much slower(99s). > > By comparing these two files, the strang phenomenon still exist: > 1) Matrix A has more basis vectors(375) than B(189), but A spent less time on BVCreate(0.6s) than B(32s); > 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) > 3) In debug version, matrix B distribute much more unbalancedly storage among processors(memory max/min 4365) than A(memory max/min 1.113), but other metrics seems more balanced. And in no-debug version there is no memory information output. > > The significant difference I can tell is :1) B use preallocation; 2) A's matrix elements are calculated by CPU, while B's matrix elements are calculated by GPU and then transfered to CPU and solved by PETSc in CPU. > > Does this is a normal result? I mean, the matrix with less non-zero elements and less dimension can cost more epssolve time? Is this due to the structure of matrix? IF so, is there any ways to increase the solve speed? > > Or this is weired and should be fixed by some ways? > Thank you! > > Runfeng Jin > > > Jose E. Roman ?2022?6?12??? 16:08??? > Please always respond to the list. > > Pay attention to the warnings in the log: > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option. # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > With the debugging option the times are not trustworthy, so I suggest repeating the analysis with an optimized build. > > Jose > > > > El 12 jun 2022, a las 5:41, Runfeng Jin escribi?: > > > > Hello! > > I compare these two matrix solver's log view and find some strange thing. Attachment files are the log view.: > > file 1: log of matrix A solver. This is a larger matrix(900,000*900,000) but solved quickly(30s); > > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547 , a little different from the matrix B that is mentioned in initial email, but solved much slower too. 
I use this for a quicker test) but solved much slower(1244s). > > > > By comparing these two files, I find some thing: > > 1) Matrix A has more basis vectors(375) than B(189), but A spent less time on BVCreate(0.349s) than B(296s); > > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) > > 3) Matrix B distribute much more unbalancedly storage among processors(memory max/min 4365) than A(memory max/min 1.113), but other metrics seems more balanced. > > > > I don't do prealocation in A, and it is distributed across processors by PETSc. For B , when preallocation I use PetscSplitOwnership to decide which part belongs to local processor, and B is also distributed by PETSc when compute matrix values. > > > > - Does this mean, for matrix B, too much nonzero elements are stored in single process, and this is why it cost too much more time in solving the matrix and find eigenvalues? If so, are there some better ways to distribute the matrix among processors? > > - Or are there any else reasons for this difference in cost time? > > > > Hope to recieve your reply, thank you! > > > > Runfeng Jin > > > > > > > > Runfeng Jin ?2022?6?11??? 20:33??? > > Hello! > > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much time. Is there anything else I can do? Attachment is log when use PETSC_DEFAULT for eps_ncv. > > > > Thank you ! > > > > Runfeng Jin > > > > Jose E. Roman ?2022?6?10??? 20:50??? > > The value -eps_ncv 5000 is huge. > > Better let SLEPc use the default value. > > > > Jose > > > > > > > El 10 jun 2022, a las 14:24, Jin Runfeng escribi?: > > > > > > Hello! > > > I want to acquire the 3 smallest eigenvalue, and attachment is the log view output. I can see epssolve really cost the major time. But I can not see why it cost so much time. Can you see something from it? > > > > > > Thank you ! > > > > > > Runfeng Jin > > > > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman wrote: > > > Convergence depends on distribution of eigenvalues you want to compute. On the other hand, the cost also depends on the time it takes to build the preconditioner. Use -log_view to see the cost of the different steps of the computation. > > > > > > Jose > > > > > > > > > > El 3 jun 2022, a las 18:50, jsfaraway escribi?: > > > > > > > > hello! > > > > > > > > I am trying to use epsgd compute matrix's one smallest eigenvalue. And I find a strang thing. There are two matrix A(900000*900000) and B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B use 22 iterations and 38885s! What could be the reason for this? Or what can I do to find the reason? > > > > > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". > > > > And there is one difference I can tell is matrix B has many small value, whose absolute value is less than 10-6. Could this be the reason? > > > > > > > > Thank you! > > > > > > > > Runfeng Jin > > > > > > > > > From jsfaraway at gmail.com Wed Jun 15 03:20:45 2022 From: jsfaraway at gmail.com (Runfeng Jin) Date: Wed, 15 Jun 2022 16:20:45 +0800 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: <14570A42-226E-49BB-9074-850258860ACC@dsic.upv.es> References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> <14570A42-226E-49BB-9074-850258860ACC@dsic.upv.es> Message-ID: Hi! I use the same machine, same nodes and same processors per nodes. And I test many times, so this seems not an accidental result. But your points do inspire me. 
I use the Global Arrays communicator when solving matrix A, and just MPI_COMM_WORLD for B. On every node, the Global Arrays communicator dedicates one processor to managing communication; maybe this is the reason for the difference in communication speed? I will have a try and respond as soon as I get the result! Runfeng Jin
Jose E. Roman wrote on Jun 15, 2022 at 16:09: > You are comparing two different codes on two different machines? Or is it > the same machine? with different number of processes and different solver > options... > > If it is the same machine, the performance seems very different: > > Matrix A: > Average time for MPI_Barrier(): 1.90986e-05 > Average time for zero size MPI_Send(): 3.44587e-06 > > Matrix B: > Average time for MPI_Barrier(): 0.0578456 > Average time for zero size MPI_Send(): 0.00358668 > > The reductions (VecReduceComm) are taking 2.1629e-01 and 2.4972e+01, > respectively. It's a two orders of magnitude difference. > > Jose > > > > El 15 jun 2022, a las 8:58, Runfeng Jin escribió: > > > > Sorry ,I miss the attachment. > > > > Runfeng Jin > > > > Runfeng Jin wrote on Jun 15, 2022 at 14:56: > > Hi! You are right! I try to use a SLEPc and PETSc version with nodebug, > and the matrix B's solver time become 99s. But It is still a little higher > than matrix A(8s). Same as mentioned before, attachment is log view of > no-debug version: > > file 1: log of matrix A solver. This is a larger > matrix(900,000*900,000) but solved quickly(8s); > > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547) > but solved much slower(99s). > > > > By comparing these two files, the strang phenomenon still exist: > > 1) Matrix A has more basis vectors(375) than B(189), but A spent less > time on BVCreate(0.6s) than B(32s); > > 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) > > 3) In debug version, matrix B distribute much more unbalancedly storage > among processors(memory max/min 4365) than A(memory max/min 1.113), but > other metrics seems more balanced. And in no-debug version there is no > memory information output. > > > > The significant difference I can tell is :1) B use preallocation; 2) A's > matrix elements are calculated by CPU, while B's matrix elements are > calculated by GPU and then transfered to CPU and solved by PETSc in CPU. > > > > Does this is a normal result? I mean, the matrix with less non-zero > elements and less dimension can cost more epssolve time? Is this due to the > structure of matrix? IF so, is there any ways to increase the solve speed? > > > > Or this is weired and should be fixed by some ways? > > Thank you! > > > > Runfeng Jin > > > > > > Jose E. Roman wrote on Jun 12, 2022 at 16:08: > > Please always respond to the list. > > > > Pay attention to the warnings in the log: > > > > ########################################################## > > # # > > # WARNING!!! # > > # # > > # This code was compiled with a debugging option. # > > # To get timing results run ./configure # > > # using --with-debugging=no, the performance will # > > # be generally two or three times faster. # > > # # > > ########################################################## > > > > With the debugging option the times are not trustworthy, so I suggest > repeating the analysis with an optimized build. > > > > Jose > > > > > > > El 12 jun 2022, a las 5:41, Runfeng Jin > escribió: > > > > > > Hello! > > > I compare these two matrix solver's log view and find some strange > thing. Attachment files are the log view.: > > > file 1: log of matrix A solver.
This is a larger > matrix(900,000*900,000) but solved quickly(30s); > > > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547 > , a little different from the matrix B that is mentioned in initial email, > but solved much slower too. I use this for a quicker test) but solved much > slower(1244s). > > > > > > By comparing these two files, I find some thing: > > > 1) Matrix A has more basis vectors(375) than B(189), but A spent less > time on BVCreate(0.349s) than B(296s); > > > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) > > > 3) Matrix B distribute much more unbalancedly storage among > processors(memory max/min 4365) than A(memory max/min 1.113), but other > metrics seems more balanced. > > > > > > I don't do prealocation in A, and it is distributed across processors > by PETSc. For B , when preallocation I use PetscSplitOwnership to decide > which part belongs to local processor, and B is also distributed by PETSc > when compute matrix values. > > > > > > - Does this mean, for matrix B, too much nonzero elements are stored > in single process, and this is why it cost too much more time in solving > the matrix and find eigenvalues? If so, are there some better ways to > distribute the matrix among processors? > > > - Or are there any else reasons for this difference in cost time? > > > > > > Hope to recieve your reply, thank you! > > > > > > Runfeng Jin > > > > > > > > > > > > Runfeng Jin ?2022?6?11??? 20:33??? > > > Hello! > > > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much time. > Is there anything else I can do? Attachment is log when use PETSC_DEFAULT > for eps_ncv. > > > > > > Thank you ! > > > > > > Runfeng Jin > > > > > > Jose E. Roman ?2022?6?10??? 20:50??? > > > The value -eps_ncv 5000 is huge. > > > Better let SLEPc use the default value. > > > > > > Jose > > > > > > > > > > El 10 jun 2022, a las 14:24, Jin Runfeng > escribi?: > > > > > > > > Hello! > > > > I want to acquire the 3 smallest eigenvalue, and attachment is the > log view output. I can see epssolve really cost the major time. But I can > not see why it cost so much time. Can you see something from it? > > > > > > > > Thank you ! > > > > > > > > Runfeng Jin > > > > > > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman wrote: > > > > Convergence depends on distribution of eigenvalues you want to > compute. On the other hand, the cost also depends on the time it takes to > build the preconditioner. Use -log_view to see the cost of the different > steps of the computation. > > > > > > > > Jose > > > > > > > > > > > > > El 3 jun 2022, a las 18:50, jsfaraway > escribi?: > > > > > > > > > > hello! > > > > > > > > > > I am trying to use epsgd compute matrix's one smallest eigenvalue. > And I find a strang thing. There are two matrix A(900000*900000) and > B(90000*90000). While solve A use 371 iterations and only 30.83s, solve B > use 22 iterations and 38885s! What could be the reason for this? Or what > can I do to find the reason? > > > > > > > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". > > > > > And there is one difference I can tell is matrix B has many small > value, whose absolute value is less than 10-6. Could this be the reason? > > > > > > > > > > Thank you! > > > > > > > > > > Runfeng Jin > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Jun 15 06:22:30 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 15 Jun 2022 07:22:30 -0400 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> <14570A42-226E-49BB-9074-850258860ACC@dsic.upv.es> Message-ID: On Wed, Jun 15, 2022 at 4:21 AM Runfeng Jin wrote: > Hi! > I use the same machine, same nodes and same processors per nodes. And I > test many times, so this seems not an accidental result. But your points do > inspire me. I use Global Array's communicator when solving matrix A, ang > just MPI_COMM_WORLD in B. In every node, Global Array's communicator > make one processor dedicated to manage communicate, maybe this is the > reason for the difference in communicating speed? > > I will have a try and respond as soon as I get the result! > I would ask the sysadmin for that machine. That Barrier time is so high, I would think something is wrong with the switch. Or you are oversubscribing which is causing massive slowdown. Thanks, Matt > Runfeng Jin > > > Jose E. Roman ?2022?6?15??? 16:09??? > >> You are comparing two different codes on two different machines? Or is it >> the same machine? with different number of processes and different solver >> options... >> >> If it is the same machine, the performance seems very different: >> >> Matrix A: >> Average time for MPI_Barrier(): 1.90986e-05 >> Average time for zero size MPI_Send(): 3.44587e-06 >> >> Matrix B: >> Average time for MPI_Barrier(): 0.0578456 >> Average time for zero size MPI_Send(): 0.00358668 >> >> The reductions (VecReduceComm) are taking 2.1629e-01 and 2.4972e+01, >> respectively. It's a two orders of magnitude difference. >> >> Jose >> >> >> > El 15 jun 2022, a las 8:58, Runfeng Jin escribi?: >> > >> > Sorry ,I miss the attachment. >> > >> > Runfeng Jin >> > >> > Runfeng Jin ?2022?6?15??? 14:56??? >> > Hi! You are right! I try to use a SLEPc and PETSc version with >> nodebug, and the matrix B's solver time become 99s. But It is still a >> little higher than matrix A(8s). Same as mentioned before, attachment is >> log view of no-debug version: >> > file 1: log of matrix A solver. This is a larger >> matrix(900,000*900,000) but solved quickly(8s); >> > file 2: log of matix B solver. This is a smaller matrix(2,547*2,547) >> but solved much slower(99s). >> > >> > By comparing these two files, the strang phenomenon still exist: >> > 1) Matrix A has more basis vectors(375) than B(189), but A spent less >> time on BVCreate(0.6s) than B(32s); >> > 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) >> > 3) In debug version, matrix B distribute much more unbalancedly storage >> among processors(memory max/min 4365) than A(memory max/min 1.113), but >> other metrics seems more balanced. And in no-debug version there is no >> memory information output. >> > >> > The significant difference I can tell is :1) B use preallocation; 2) >> A's matrix elements are calculated by CPU, while B's matrix elements are >> calculated by GPU and then transfered to CPU and solved by PETSc in CPU. >> > >> > Does this is a normal result? I mean, the matrix with less non-zero >> elements and less dimension can cost more epssolve time? Is this due to the >> structure of matrix? IF so, is there any ways to increase the solve speed? >> > >> > Or this is weired and should be fixed by some ways? >> > Thank you! >> > >> > Runfeng Jin >> > >> > >> > Jose E. 
Roman ?2022?6?12??? 16:08??? >> > Please always respond to the list. >> > >> > Pay attention to the warnings in the log: >> > >> > ########################################################## >> > # # >> > # WARNING!!! # >> > # # >> > # This code was compiled with a debugging option. # >> > # To get timing results run ./configure # >> > # using --with-debugging=no, the performance will # >> > # be generally two or three times faster. # >> > # # >> > ########################################################## >> > >> > With the debugging option the times are not trustworthy, so I suggest >> repeating the analysis with an optimized build. >> > >> > Jose >> > >> > >> > > El 12 jun 2022, a las 5:41, Runfeng Jin >> escribi?: >> > > >> > > Hello! >> > > I compare these two matrix solver's log view and find some strange >> thing. Attachment files are the log view.: >> > > file 1: log of matrix A solver. This is a larger >> matrix(900,000*900,000) but solved quickly(30s); >> > > file 2: log of matix B solver. This is a smaller >> matrix(2,547*2,547 , a little different from the matrix B that is mentioned >> in initial email, but solved much slower too. I use this for a quicker >> test) but solved much slower(1244s). >> > > >> > > By comparing these two files, I find some thing: >> > > 1) Matrix A has more basis vectors(375) than B(189), but A spent less >> time on BVCreate(0.349s) than B(296s); >> > > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) >> > > 3) Matrix B distribute much more unbalancedly storage among >> processors(memory max/min 4365) than A(memory max/min 1.113), but other >> metrics seems more balanced. >> > > >> > > I don't do prealocation in A, and it is distributed across processors >> by PETSc. For B , when preallocation I use PetscSplitOwnership to decide >> which part belongs to local processor, and B is also distributed by PETSc >> when compute matrix values. >> > > >> > > - Does this mean, for matrix B, too much nonzero elements are stored >> in single process, and this is why it cost too much more time in solving >> the matrix and find eigenvalues? If so, are there some better ways to >> distribute the matrix among processors? >> > > - Or are there any else reasons for this difference in cost time? >> > > >> > > Hope to recieve your reply, thank you! >> > > >> > > Runfeng Jin >> > > >> > > >> > > >> > > Runfeng Jin ?2022?6?11??? 20:33??? >> > > Hello! >> > > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much >> time. Is there anything else I can do? Attachment is log when use >> PETSC_DEFAULT for eps_ncv. >> > > >> > > Thank you ! >> > > >> > > Runfeng Jin >> > > >> > > Jose E. Roman ?2022?6?10??? 20:50??? >> > > The value -eps_ncv 5000 is huge. >> > > Better let SLEPc use the default value. >> > > >> > > Jose >> > > >> > > >> > > > El 10 jun 2022, a las 14:24, Jin Runfeng >> escribi?: >> > > > >> > > > Hello! >> > > > I want to acquire the 3 smallest eigenvalue, and attachment is the >> log view output. I can see epssolve really cost the major time. But I can >> not see why it cost so much time. Can you see something from it? >> > > > >> > > > Thank you ! >> > > > >> > > > Runfeng Jin >> > > > >> > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman wrote: >> > > > Convergence depends on distribution of eigenvalues you want to >> compute. On the other hand, the cost also depends on the time it takes to >> build the preconditioner. Use -log_view to see the cost of the different >> steps of the computation. 
>> > > > >> > > > Jose >> > > > >> > > > >> > > > > El 3 jun 2022, a las 18:50, jsfaraway >> escribi?: >> > > > > >> > > > > hello! >> > > > > >> > > > > I am trying to use epsgd compute matrix's one smallest >> eigenvalue. And I find a strang thing. There are two matrix >> A(900000*900000) and B(90000*90000). While solve A use 371 iterations and >> only 30.83s, solve B use 22 iterations and 38885s! What could be the reason >> for this? Or what can I do to find the reason? >> > > > > >> > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". >> > > > > And there is one difference I can tell is matrix B has many small >> value, whose absolute value is less than 10-6. Could this be the reason? >> > > > > >> > > > > Thank you! >> > > > > >> > > > > Runfeng Jin >> > > > >> > > >> > > >> >> > >> > >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsfaraway at gmail.com Wed Jun 15 20:31:26 2022 From: jsfaraway at gmail.com (Runfeng Jin) Date: Thu, 16 Jun 2022 09:31:26 +0800 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> <14570A42-226E-49BB-9074-850258860ACC@dsic.upv.es> Message-ID: Hi! Thank you for your reply. I am a little confused about the problem of machine. These two matrices solved in the same cluster, if there are some problems about the machine, why the low performance just happen to the matrix B? And, what is the situation of oversubscribing? Could you give some examples? Thank you! Runfeng Jin Matthew Knepley ?2022?6?15??? 19:22??? > On Wed, Jun 15, 2022 at 4:21 AM Runfeng Jin wrote: > >> Hi! >> I use the same machine, same nodes and same processors per nodes. And I >> test many times, so this seems not an accidental result. But your points do >> inspire me. I use Global Array's communicator when solving matrix A, ang >> just MPI_COMM_WORLD in B. In every node, Global Array's communicator >> make one processor dedicated to manage communicate, maybe this is the >> reason for the difference in communicating speed? >> >> I will have a try and respond as soon as I get the result! >> > > I would ask the sysadmin for that machine. That Barrier time is so high, I > would think something is wrong with the switch. Or you are > oversubscribing which is causing massive slowdown. > > Thanks, > > Matt > > >> Runfeng Jin >> >> >> Jose E. Roman ?2022?6?15??? 16:09??? >> >>> You are comparing two different codes on two different machines? Or is >>> it the same machine? with different number of processes and different >>> solver options... >>> >>> If it is the same machine, the performance seems very different: >>> >>> Matrix A: >>> Average time for MPI_Barrier(): 1.90986e-05 >>> Average time for zero size MPI_Send(): 3.44587e-06 >>> >>> Matrix B: >>> Average time for MPI_Barrier(): 0.0578456 >>> Average time for zero size MPI_Send(): 0.00358668 >>> >>> The reductions (VecReduceComm) are taking 2.1629e-01 and 2.4972e+01, >>> respectively. It's a two orders of magnitude difference. >>> >>> Jose >>> >>> >>> > El 15 jun 2022, a las 8:58, Runfeng Jin >>> escribi?: >>> > >>> > Sorry ,I miss the attachment. >>> > >>> > Runfeng Jin >>> > >>> > Runfeng Jin ?2022?6?15??? 14:56??? >>> > Hi! 
You are right! I try to use a SLEPc and PETSc version with >>> nodebug, and the matrix B's solver time become 99s. But It is still a >>> little higher than matrix A(8s). Same as mentioned before, attachment is >>> log view of no-debug version: >>> > file 1: log of matrix A solver. This is a larger >>> matrix(900,000*900,000) but solved quickly(8s); >>> > file 2: log of matix B solver. This is a smaller >>> matrix(2,547*2,547) but solved much slower(99s). >>> > >>> > By comparing these two files, the strang phenomenon still exist: >>> > 1) Matrix A has more basis vectors(375) than B(189), but A spent less >>> time on BVCreate(0.6s) than B(32s); >>> > 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) >>> > 3) In debug version, matrix B distribute much more unbalancedly >>> storage among processors(memory max/min 4365) than A(memory max/min 1.113), >>> but other metrics seems more balanced. And in no-debug version there is no >>> memory information output. >>> > >>> > The significant difference I can tell is :1) B use preallocation; 2) >>> A's matrix elements are calculated by CPU, while B's matrix elements are >>> calculated by GPU and then transfered to CPU and solved by PETSc in CPU. >>> > >>> > Does this is a normal result? I mean, the matrix with less non-zero >>> elements and less dimension can cost more epssolve time? Is this due to the >>> structure of matrix? IF so, is there any ways to increase the solve speed? >>> > >>> > Or this is weired and should be fixed by some ways? >>> > Thank you! >>> > >>> > Runfeng Jin >>> > >>> > >>> > Jose E. Roman ?2022?6?12??? 16:08??? >>> > Please always respond to the list. >>> > >>> > Pay attention to the warnings in the log: >>> > >>> > ########################################################## >>> > # # >>> > # WARNING!!! # >>> > # # >>> > # This code was compiled with a debugging option. # >>> > # To get timing results run ./configure # >>> > # using --with-debugging=no, the performance will # >>> > # be generally two or three times faster. # >>> > # # >>> > ########################################################## >>> > >>> > With the debugging option the times are not trustworthy, so I suggest >>> repeating the analysis with an optimized build. >>> > >>> > Jose >>> > >>> > >>> > > El 12 jun 2022, a las 5:41, Runfeng Jin >>> escribi?: >>> > > >>> > > Hello! >>> > > I compare these two matrix solver's log view and find some strange >>> thing. Attachment files are the log view.: >>> > > file 1: log of matrix A solver. This is a larger >>> matrix(900,000*900,000) but solved quickly(30s); >>> > > file 2: log of matix B solver. This is a smaller >>> matrix(2,547*2,547 , a little different from the matrix B that is mentioned >>> in initial email, but solved much slower too. I use this for a quicker >>> test) but solved much slower(1244s). >>> > > >>> > > By comparing these two files, I find some thing: >>> > > 1) Matrix A has more basis vectors(375) than B(189), but A spent >>> less time on BVCreate(0.349s) than B(296s); >>> > > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) >>> > > 3) Matrix B distribute much more unbalancedly storage among >>> processors(memory max/min 4365) than A(memory max/min 1.113), but other >>> metrics seems more balanced. >>> > > >>> > > I don't do prealocation in A, and it is distributed across >>> processors by PETSc. For B , when preallocation I use PetscSplitOwnership >>> to decide which part belongs to local processor, and B is also distributed >>> by PETSc when compute matrix values. 
>>> > > >>> > > - Does this mean, for matrix B, too much nonzero elements are stored >>> in single process, and this is why it cost too much more time in solving >>> the matrix and find eigenvalues? If so, are there some better ways to >>> distribute the matrix among processors? >>> > > - Or are there any else reasons for this difference in cost time? >>> > > >>> > > Hope to recieve your reply, thank you! >>> > > >>> > > Runfeng Jin >>> > > >>> > > >>> > > >>> > > Runfeng Jin ?2022?6?11??? 20:33??? >>> > > Hello! >>> > > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much >>> time. Is there anything else I can do? Attachment is log when use >>> PETSC_DEFAULT for eps_ncv. >>> > > >>> > > Thank you ! >>> > > >>> > > Runfeng Jin >>> > > >>> > > Jose E. Roman ?2022?6?10??? 20:50??? >>> > > The value -eps_ncv 5000 is huge. >>> > > Better let SLEPc use the default value. >>> > > >>> > > Jose >>> > > >>> > > >>> > > > El 10 jun 2022, a las 14:24, Jin Runfeng >>> escribi?: >>> > > > >>> > > > Hello! >>> > > > I want to acquire the 3 smallest eigenvalue, and attachment is >>> the log view output. I can see epssolve really cost the major time. But I >>> can not see why it cost so much time. Can you see something from it? >>> > > > >>> > > > Thank you ! >>> > > > >>> > > > Runfeng Jin >>> > > > >>> > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman >>> wrote: >>> > > > Convergence depends on distribution of eigenvalues you want to >>> compute. On the other hand, the cost also depends on the time it takes to >>> build the preconditioner. Use -log_view to see the cost of the different >>> steps of the computation. >>> > > > >>> > > > Jose >>> > > > >>> > > > >>> > > > > El 3 jun 2022, a las 18:50, jsfaraway >>> escribi?: >>> > > > > >>> > > > > hello! >>> > > > > >>> > > > > I am trying to use epsgd compute matrix's one smallest >>> eigenvalue. And I find a strang thing. There are two matrix >>> A(900000*900000) and B(90000*90000). While solve A use 371 iterations and >>> only 30.83s, solve B use 22 iterations and 38885s! What could be the reason >>> for this? Or what can I do to find the reason? >>> > > > > >>> > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real ". >>> > > > > And there is one difference I can tell is matrix B has many >>> small value, whose absolute value is less than 10-6. Could this be the >>> reason? >>> > > > > >>> > > > > Thank you! >>> > > > > >>> > > > > Runfeng Jin >>> > > > >>> > > >>> > > >>> >>> > >>> > >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 16 07:12:24 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Jun 2022 08:12:24 -0400 Subject: [petsc-users] SLEPc EPSGD: too much time in single iteration In-Reply-To: References: <0FE9EE1A-2CD4-44A2-ACD0-F6F4A466457A@dsic.upv.es> <873EFD58-3EB4-4FF5-97B7-E2AE8A60D17B@dsic.upv.es> <14570A42-226E-49BB-9074-850258860ACC@dsic.upv.es> Message-ID: On Wed, Jun 15, 2022 at 9:32 PM Runfeng Jin wrote: > Hi! Thank you for your reply. > > I am a little confused about the problem of machine. These two matrices > solved in the same cluster, if there are some problems about the machine, > why the low performance just happen to the matrix B? 
> The performance problem is not related to the matrix B. The MPI_Barrier time on the second run is 1,000x slower. We just run MPI_Barrier() at log output time to get this. It is not part of a solve. It could be that there is a part of the cluster that is broken and your second job got scheduled there. > And, what is the situation of oversubscribing? Could you give some > examples? > Some MPI implementations perform extremely poorly when the number of processes exceeds the number of cores. This is called oversubscription. Thanks, Matt > Thank you! > > Runfeng Jin > > Matthew Knepley ?2022?6?15??? 19:22??? > >> On Wed, Jun 15, 2022 at 4:21 AM Runfeng Jin wrote: >> >>> Hi! >>> I use the same machine, same nodes and same processors per nodes. And I >>> test many times, so this seems not an accidental result. But your points do >>> inspire me. I use Global Array's communicator when solving matrix A, ang >>> just MPI_COMM_WORLD in B. In every node, Global Array's communicator >>> make one processor dedicated to manage communicate, maybe this is the >>> reason for the difference in communicating speed? >>> >>> I will have a try and respond as soon as I get the result! >>> >> >> I would ask the sysadmin for that machine. That Barrier time is so high, >> I would think something is wrong with the switch. Or you are >> oversubscribing which is causing massive slowdown. >> >> Thanks, >> >> Matt >> >> >>> Runfeng Jin >>> >>> >>> Jose E. Roman ?2022?6?15??? 16:09??? >>> >>>> You are comparing two different codes on two different machines? Or is >>>> it the same machine? with different number of processes and different >>>> solver options... >>>> >>>> If it is the same machine, the performance seems very different: >>>> >>>> Matrix A: >>>> Average time for MPI_Barrier(): 1.90986e-05 >>>> Average time for zero size MPI_Send(): 3.44587e-06 >>>> >>>> Matrix B: >>>> Average time for MPI_Barrier(): 0.0578456 >>>> Average time for zero size MPI_Send(): 0.00358668 >>>> >>>> The reductions (VecReduceComm) are taking 2.1629e-01 and 2.4972e+01, >>>> respectively. It's a two orders of magnitude difference. >>>> >>>> Jose >>>> >>>> >>>> > El 15 jun 2022, a las 8:58, Runfeng Jin >>>> escribi?: >>>> > >>>> > Sorry ,I miss the attachment. >>>> > >>>> > Runfeng Jin >>>> > >>>> > Runfeng Jin ?2022?6?15??? 14:56??? >>>> > Hi! You are right! I try to use a SLEPc and PETSc version with >>>> nodebug, and the matrix B's solver time become 99s. But It is still a >>>> little higher than matrix A(8s). Same as mentioned before, attachment is >>>> log view of no-debug version: >>>> > file 1: log of matrix A solver. This is a larger >>>> matrix(900,000*900,000) but solved quickly(8s); >>>> > file 2: log of matix B solver. This is a smaller >>>> matrix(2,547*2,547) but solved much slower(99s). >>>> > >>>> > By comparing these two files, the strang phenomenon still exist: >>>> > 1) Matrix A has more basis vectors(375) than B(189), but A spent less >>>> time on BVCreate(0.6s) than B(32s); >>>> > 2) Matrix A spent less time on EPSSetup(0.015s) than B(0.9s) >>>> > 3) In debug version, matrix B distribute much more unbalancedly >>>> storage among processors(memory max/min 4365) than A(memory max/min 1.113), >>>> but other metrics seems more balanced. And in no-debug version there is no >>>> memory information output. 
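For what it is worth, the barrier number that -log_view reports can also be measured with a tiny standalone test, run with exactly the same mpiexec command and node placement as the slow job. This is an untested sketch and assumes PETSc 3.17 or later for PetscCall:

#include <petscsys.h>
#include <petsctime.h>

int main(int argc, char **argv)
{
  PetscLogDouble t0, t1;
  PetscInt       i, n = 100;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCallMPI(MPI_Barrier(PETSC_COMM_WORLD)); /* warm up */
  PetscCall(PetscTime(&t0));
  for (i = 0; i < n; i++) PetscCallMPI(MPI_Barrier(PETSC_COMM_WORLD));
  PetscCall(PetscTime(&t1));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Average time for MPI_Barrier(): %g\n", (double)((t1 - t0) / n)));
  PetscCall(PetscFinalize());
  return 0;
}

A result near the 1.9e-05 s seen for the matrix A run suggests a healthy interconnect; something near the 0.058 s seen for the matrix B run points at the machine or at oversubscription rather than at the matrix itself.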
>>>> > >>>> > The significant difference I can tell is :1) B use preallocation; 2) >>>> A's matrix elements are calculated by CPU, while B's matrix elements are >>>> calculated by GPU and then transfered to CPU and solved by PETSc in CPU. >>>> > >>>> > Does this is a normal result? I mean, the matrix with less non-zero >>>> elements and less dimension can cost more epssolve time? Is this due to the >>>> structure of matrix? IF so, is there any ways to increase the solve speed? >>>> > >>>> > Or this is weired and should be fixed by some ways? >>>> > Thank you! >>>> > >>>> > Runfeng Jin >>>> > >>>> > >>>> > Jose E. Roman ?2022?6?12??? 16:08??? >>>> > Please always respond to the list. >>>> > >>>> > Pay attention to the warnings in the log: >>>> > >>>> > ########################################################## >>>> > # # >>>> > # WARNING!!! # >>>> > # # >>>> > # This code was compiled with a debugging option. # >>>> > # To get timing results run ./configure # >>>> > # using --with-debugging=no, the performance will # >>>> > # be generally two or three times faster. # >>>> > # # >>>> > ########################################################## >>>> > >>>> > With the debugging option the times are not trustworthy, so I suggest >>>> repeating the analysis with an optimized build. >>>> > >>>> > Jose >>>> > >>>> > >>>> > > El 12 jun 2022, a las 5:41, Runfeng Jin >>>> escribi?: >>>> > > >>>> > > Hello! >>>> > > I compare these two matrix solver's log view and find some strange >>>> thing. Attachment files are the log view.: >>>> > > file 1: log of matrix A solver. This is a larger >>>> matrix(900,000*900,000) but solved quickly(30s); >>>> > > file 2: log of matix B solver. This is a smaller >>>> matrix(2,547*2,547 , a little different from the matrix B that is mentioned >>>> in initial email, but solved much slower too. I use this for a quicker >>>> test) but solved much slower(1244s). >>>> > > >>>> > > By comparing these two files, I find some thing: >>>> > > 1) Matrix A has more basis vectors(375) than B(189), but A spent >>>> less time on BVCreate(0.349s) than B(296s); >>>> > > 2) Matrix A spent less time on EPSSetup(0.031s) than B(10.709s) >>>> > > 3) Matrix B distribute much more unbalancedly storage among >>>> processors(memory max/min 4365) than A(memory max/min 1.113), but other >>>> metrics seems more balanced. >>>> > > >>>> > > I don't do prealocation in A, and it is distributed across >>>> processors by PETSc. For B , when preallocation I use PetscSplitOwnership >>>> to decide which part belongs to local processor, and B is also distributed >>>> by PETSc when compute matrix values. >>>> > > >>>> > > - Does this mean, for matrix B, too much nonzero elements are >>>> stored in single process, and this is why it cost too much more time in >>>> solving the matrix and find eigenvalues? If so, are there some better ways >>>> to distribute the matrix among processors? >>>> > > - Or are there any else reasons for this difference in cost time? >>>> > > >>>> > > Hope to recieve your reply, thank you! >>>> > > >>>> > > Runfeng Jin >>>> > > >>>> > > >>>> > > >>>> > > Runfeng Jin ?2022?6?11??? 20:33??? >>>> > > Hello! >>>> > > I have try ues PETSC_DEFAULT for eps_ncv, but it still cost much >>>> time. Is there anything else I can do? Attachment is log when use >>>> PETSC_DEFAULT for eps_ncv. >>>> > > >>>> > > Thank you ! >>>> > > >>>> > > Runfeng Jin >>>> > > >>>> > > Jose E. Roman ?2022?6?10??? 20:50??? >>>> > > The value -eps_ncv 5000 is huge. >>>> > > Better let SLEPc use the default value. 
>>>> > > >>>> > > Jose >>>> > > >>>> > > >>>> > > > El 10 jun 2022, a las 14:24, Jin Runfeng >>>> escribi?: >>>> > > > >>>> > > > Hello! >>>> > > > I want to acquire the 3 smallest eigenvalue, and attachment is >>>> the log view output. I can see epssolve really cost the major time. But I >>>> can not see why it cost so much time. Can you see something from it? >>>> > > > >>>> > > > Thank you ! >>>> > > > >>>> > > > Runfeng Jin >>>> > > > >>>> > > > On 6? 4 2022, at 1:37 ??, Jose E. Roman >>>> wrote: >>>> > > > Convergence depends on distribution of eigenvalues you want to >>>> compute. On the other hand, the cost also depends on the time it takes to >>>> build the preconditioner. Use -log_view to see the cost of the different >>>> steps of the computation. >>>> > > > >>>> > > > Jose >>>> > > > >>>> > > > >>>> > > > > El 3 jun 2022, a las 18:50, jsfaraway >>>> escribi?: >>>> > > > > >>>> > > > > hello! >>>> > > > > >>>> > > > > I am trying to use epsgd compute matrix's one smallest >>>> eigenvalue. And I find a strang thing. There are two matrix >>>> A(900000*900000) and B(90000*90000). While solve A use 371 iterations and >>>> only 30.83s, solve B use 22 iterations and 38885s! What could be the reason >>>> for this? Or what can I do to find the reason? >>>> > > > > >>>> > > > > I use" -eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real >>>> ". >>>> > > > > And there is one difference I can tell is matrix B has many >>>> small value, whose absolute value is less than 10-6. Could this be the >>>> reason? >>>> > > > > >>>> > > > > Thank you! >>>> > > > > >>>> > > > > Runfeng Jin >>>> > > > >>>> > > >>>> > > >>>> >>>> > >>>> > >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Thu Jun 16 10:11:32 2022 From: yangzongze at gmail.com (Zongze Yang) Date: Thu, 16 Jun 2022 23:11:32 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? Message-ID: Hi, if I load a `gmsh` file with second-order elements, the coordinates will be stored in a DG-P2 space. After obtaining the coordinates of a cell, how can I map the coordinates to vertex and edge? Below is some code load the gmsh file, I want to know the relation between `cl` and `cell_coords`. ``` import firedrake as fd import numpy as np # Load gmsh file (2rd) plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') cs, ce = plex.getHeightStratum(0) cdm = plex.getCoordinateDM() csec = dm.getCoordinateSection() coords_gvec = dm.getCoordinates() for i in range(cs, ce): cell_coords = cdm.getVecClosure(csec, coords_gvec, i) print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, 3])}') cl = dm.getTransitiveClosure(i) print('closure:', cl) break ``` Best wishes, Zongze -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test-fd-load-p2-rect.msh Type: application/octet-stream Size: 189254 bytes Desc: not available URL: From knepley at gmail.com Thu Jun 16 10:22:08 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Jun 2022 11:22:08 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: Message-ID: On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang wrote: > Hi, if I load a `gmsh` file with second-order elements, the coordinates > will be stored in a DG-P2 space. After obtaining the coordinates of a cell, > how can I map the coordinates to vertex and edge? > By default, they are stored as P2, not DG. You can ask for the coordinates of a vertex or an edge directly using https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ by giving the vertex or edge point. You can get all the coordinates on a cell, in the closure order, using https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ Thanks, Matt > Below is some code load the gmsh file, I want to know the relation between > `cl` and `cell_coords`. > > ``` > import firedrake as fd > import numpy as np > > # Load gmsh file (2rd) > plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') > > cs, ce = plex.getHeightStratum(0) > > cdm = plex.getCoordinateDM() > csec = dm.getCoordinateSection() > coords_gvec = dm.getCoordinates() > > for i in range(cs, ce): > cell_coords = cdm.getVecClosure(csec, coords_gvec, i) > print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, 3])}') > cl = dm.getTransitiveClosure(i) > print('closure:', cl) > break > ``` > > Best wishes, > Zongze > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Thu Jun 16 11:06:22 2022 From: yangzongze at gmail.com (Zongze Yang) Date: Fri, 17 Jun 2022 00:06:22 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: Message-ID: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> > ? 2022?6?16??23:22?Matthew Knepley ??? > > ? >> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang wrote: > >> Hi, if I load a `gmsh` file with second-order elements, the coordinates will be stored in a DG-P2 space. After obtaining the coordinates of a cell, how can I map the coordinates to vertex and edge? > > By default, they are stored as P2, not DG. I checked the coordinates vector, and found the dogs only defined on cell other than vertex and edge, so I said they are stored as DG. Then the function DMPlexVecGetClosure seems return the coordinates in lex order. Some code in reading gmsh file reads that 1756: if (isSimplex) continuity = PETSC_FALSE; /* XXX FIXME Requires DMPlexSetClosurePermutationLexicographic() */ 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, dim, coordDim, order, &fe) The continuity is set to false for simplex. Thanks, Zongze > > You can ask for the coordinates of a vertex or an edge directly using > > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ > > by giving the vertex or edge point. 
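A minimal untested sketch of that call in C is below; the helper name PrintPointCoords is made up here, dm is the DMPlex, p a vertex or edge point, and it assumes the coordinates are stored as a continuous P2 field so that vertices and edges actually carry coordinate dofs:

```
#include <petscdmplex.h>

/* Untested sketch: read the coordinates attached to a single vertex (or
   edge) point p of a DMPlex dm via DMPlexPointLocalRead. */
static PetscErrorCode PrintPointCoords(DM dm, PetscInt p)
{
  DM                 cdm;
  Vec                coords;
  const PetscScalar *a, *xp;
  PetscInt           dim, d;

  PetscFunctionBeginUser;
  PetscCall(DMGetCoordinateDim(dm, &dim));
  PetscCall(DMGetCoordinateDM(dm, &cdm));        /* its local section is the coordinate section */
  PetscCall(DMGetCoordinatesLocal(dm, &coords));
  PetscCall(VecGetArrayRead(coords, &a));
  PetscCall(DMPlexPointLocalRead(cdm, p, a, &xp));
  for (d = 0; d < dim; d++) PetscCall(PetscPrintf(PETSC_COMM_SELF, " %g", (double)PetscRealPart(xp[d])));
  PetscCall(PetscPrintf(PETSC_COMM_SELF, "\n"));
  PetscCall(VecRestoreArrayRead(coords, &a));
  PetscFunctionReturn(0);
}
```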
You can get all the coordinates on a cell, in the closure order, using > > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ > Thanks, > > Matt > >> Below is some code load the gmsh file, I want to know the relation between `cl` and `cell_coords`. >> >> ``` >> import firedrake as fd >> import numpy as np >> >> # Load gmsh file (2rd) >> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >> >> cs, ce = plex.getHeightStratum(0) >> >> cdm = plex.getCoordinateDM() >> csec = dm.getCoordinateSection() >> coords_gvec = dm.getCoordinates() >> >> for i in range(cs, ce): >> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, 3])}') >> cl = dm.getTransitiveClosure(i) >> print('closure:', cl) >> break >> ``` >> >> Best wishes, >> Zongze > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 16 12:11:26 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Jun 2022 13:11:26 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang wrote: > > > ? 2022?6?16??23:22?Matthew Knepley ??? > > ? > On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang wrote: > >> Hi, if I load a `gmsh` file with second-order elements, the coordinates >> will be stored in a DG-P2 space. After obtaining the coordinates of a cell, >> how can I map the coordinates to vertex and edge? >> > > By default, they are stored as P2, not DG. > > > I checked the coordinates vector, and found the dogs only defined on cell > other than vertex and edge, so I said they are stored as DG. > Then the function DMPlexVecGetClosure > seems return > the coordinates in lex order. > > Some code in reading gmsh file reads that > > > 1756: if (isSimplex) continuity = PETSC_FALSE > ; /* XXX FIXME > Requires DMPlexSetClosurePermutationLexicographic() */ > > > 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, dim, > coordDim, order, &fe) > > > The continuity is set to false for simplex. > Oh, yes. That needs to be fixed. For now, you can just project it to P2 if you want using https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ Thanks, Matt > Thanks, > Zongze > > You can ask for the coordinates of a vertex or an edge directly using > > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ > > by giving the vertex or edge point. You can get all the coordinates on a > cell, in the closure order, using > > https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ > > Thanks, > > Matt > > >> Below is some code load the gmsh file, I want to know the relation >> between `cl` and `cell_coords`. 
>> >> ``` >> import firedrake as fd >> import numpy as np >> >> # Load gmsh file (2rd) >> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >> >> cs, ce = plex.getHeightStratum(0) >> >> cdm = plex.getCoordinateDM() >> csec = dm.getCoordinateSection() >> coords_gvec = dm.getCoordinates() >> >> for i in range(cs, ce): >> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, 3])}') >> cl = dm.getTransitiveClosure(i) >> print('closure:', cl) >> break >> ``` >> >> Best wishes, >> Zongze >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tt73 at njit.edu Thu Jun 16 16:57:23 2022 From: tt73 at njit.edu (tt73) Date: Thu, 16 Jun 2022 17:57:23 -0400 Subject: [petsc-users] Customizing NASM subsnes Message-ID: <62aba746.1c69fb81.7df46.678d@mx.google.com> Hi,?I am using? NASM as the outer solver for a nonlinear problem. For one of the subdomains, I want to run the local solve with a different set of options form the others. Is there any way to set options for each subdomain?? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jun 16 17:23:42 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Jun 2022 18:23:42 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: <62aba746.1c69fb81.7df46.678d@mx.google.com> References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: On Thu, Jun 16, 2022 at 5:57 PM tt73 wrote: > > Hi, > > I am using NASM as the outer solver for a nonlinear problem. For one of > the subdomains, I want to run the local solve with a different set of > options form the others. Is there any way to set options for each > subdomain? > I can see two ways: 1) Pull out the subsolver and set it using the API 2) Pull out the subsolver and give it a different prefix Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Fri Jun 17 01:54:03 2022 From: yangzongze at gmail.com (Zongze Yang) Date: Fri, 17 Jun 2022 14:54:03 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: I tried the projection operation. However, it seems that the projection gives the wrong solution. After projection, the bounding box is changed! See logs below. 
First, I patch the petsc4py by adding `DMProjectCoordinates`: ``` diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx b/src/binding/petsc4py/src/PETSc/DM.pyx index d8a58d183a..dbcdb280f1 100644 --- a/src/binding/petsc4py/src/PETSc/DM.pyx +++ b/src/binding/petsc4py/src/PETSc/DM.pyx @@ -307,6 +307,12 @@ cdef class DM(Object): PetscINCREF(c.obj) return c + def projectCoordinates(self, FE fe=None): + if fe is None: + CHKERR( DMProjectCoordinates(self.dm, NULL) ) + else: + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) + def getBoundingBox(self): cdef PetscInt i,dim=0 CHKERR( DMGetCoordinateDim(self.dm, &dim) ) diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi b/src/binding/petsc4py/src/PETSc/petscdm.pxi index 514b6fa472..c778e39884 100644 --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi @@ -90,6 +90,7 @@ cdef extern from * nogil: int DMGetCoordinateDim(PetscDM,PetscInt*) int DMSetCoordinateDim(PetscDM,PetscInt) int DMLocalizeCoordinates(PetscDM) + int DMProjectCoordinates(PetscDM, PetscFE) int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) int DMCreateInjection(PetscDM,PetscDM,PetscMat*) ``` Then in python, I load a mesh and project the coordinates to P2: ``` import firedrake as fd from firedrake.petsc import PETSc # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') print('old bbox:', plex.getBoundingBox()) dim = plex.getDimension() # (dim, nc, isSimplex, k, qorder, comm=None) fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, PETSc.DETERMINE) plex.projectCoordinates(fe_new) fe_new.view() print('new bbox:', plex.getBoundingBox()) ``` The output is (The bounding box is changed!) ``` old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) PetscFE Object: P2 1 MPI processes type: basic Basic Finite Element in 3 dimensions with 3 components PetscSpace Object: P2 1 MPI processes type: sum Space in 3 variables with 3 components, size 30 Sum space of 3 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 1 MPI processes type: poly Space in 3 variables with 1 components, size 10 Polynomial space of degree 2 PetscDualSpace Object: P2 1 MPI processes type: lagrange Dual space with 3 components, size 30 Continuous Lagrange dual space Quadrature of order 5 on 27 points (dim 3) new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) ``` By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? Thanks! Zongze Matthew Knepley ?2022?6?17??? 01:11??? > On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang wrote: > >> >> >> ? 2022?6?16??23:22?Matthew Knepley ??? >> >> ? >> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >> wrote: >> >>> Hi, if I load a `gmsh` file with second-order elements, the coordinates >>> will be stored in a DG-P2 space. After obtaining the coordinates of a cell, >>> how can I map the coordinates to vertex and edge? >>> >> >> By default, they are stored as P2, not DG. >> >> >> I checked the coordinates vector, and found the dogs only defined on cell >> other than vertex and edge, so I said they are stored as DG. >> Then the function DMPlexVecGetClosure >> seems return >> the coordinates in lex order. 
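One untested way to see where the coordinate dofs actually live is to query the coordinate PetscSection directly (in C; dm is assumed to be the loaded DMPlex):

```
/* Untested sketch: print how many coordinate dofs sit on each mesh point,
   by depth. With continuous P2 coordinates, vertices and edges carry dofs;
   with the DG layout described above, only the cells do. */
PetscSection csec;
PetscInt     depth, d, pStart, pEnd, p, dof;

PetscCall(DMGetCoordinateSection(dm, &csec));
PetscCall(DMPlexGetDepth(dm, &depth));
for (d = 0; d <= depth; d++) {
  PetscCall(DMPlexGetDepthStratum(dm, d, &pStart, &pEnd));
  for (p = pStart; p < pEnd; p++) {
    PetscCall(PetscSectionGetDof(csec, p, &dof));
    PetscCall(PetscPrintf(PETSC_COMM_SELF, "depth %d point %d: %d coordinate dofs\n", (int)d, (int)p, (int)dof));
  }
}
```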
>> >> Some code in reading gmsh file reads that >> >> >> 1756: if (isSimplex) continuity = PETSC_FALSE >> ; /* XXX FIXME >> Requires DMPlexSetClosurePermutationLexicographic() */ >> >> >> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, dim, >> coordDim, order, &fe) >> >> >> The continuity is set to false for simplex. >> > > Oh, yes. That needs to be fixed. For now, you can just project it to P2 if > you want using > > https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ > > Thanks, > > Matt > > >> Thanks, >> Zongze >> >> You can ask for the coordinates of a vertex or an edge directly using >> >> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >> >> by giving the vertex or edge point. You can get all the coordinates on a >> cell, in the closure order, using >> >> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >> >> Thanks, >> >> Matt >> >> >>> Below is some code load the gmsh file, I want to know the relation >>> between `cl` and `cell_coords`. >>> >>> ``` >>> import firedrake as fd >>> import numpy as np >>> >>> # Load gmsh file (2rd) >>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>> >>> cs, ce = plex.getHeightStratum(0) >>> >>> cdm = plex.getCoordinateDM() >>> csec = dm.getCoordinateSection() >>> coords_gvec = dm.getCoordinates() >>> >>> for i in range(cs, ce): >>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, 3])}') >>> cl = dm.getTransitiveClosure(i) >>> print('closure:', cl) >>> break >>> ``` >>> >>> Best wishes, >>> Zongze >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tt73 at njit.edu Fri Jun 17 08:22:37 2022 From: tt73 at njit.edu (Takahashi, Tadanaga) Date: Fri, 17 Jun 2022 09:22:37 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: I'm having some trouble pulling out the subsolver. I tried to use SNESNASMGetSNES in a loop over each subdomain. However I get an error when I run the code with more than one MPI processors. Here is a snippet from my code: SNES snes, subsnes; PetscMPIInt rank, size; ... ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); ... ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); ierr = SNESSetUp(snes); CHKERRQ(ierr); PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); for (i=0; i wrote: > On Thu, Jun 16, 2022 at 5:57 PM tt73 wrote: > >> >> Hi, >> >> I am using NASM as the outer solver for a nonlinear problem. For one of >> the subdomains, I want to run the local solve with a different set of >> options form the others. Is there any way to set options for each >> subdomain? 
>> > > I can see two ways: > > 1) Pull out the subsolver and set it using the API > > 2) Pull out the subsolver and give it a different prefix > > Thanks, > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 17 08:35:09 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Jun 2022 09:35:09 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: On Fri, Jun 17, 2022 at 9:22 AM Takahashi, Tadanaga wrote: > I'm having some trouble pulling out the subsolver. I tried to use > SNESNASMGetSNES in a loop over each subdomain. However I get an error when > I run the code with more than one MPI processors. Here is a snippet from my > code: > > SNES snes, subsnes; > PetscMPIInt rank, size; > ... > ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); > ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); > ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); > ... > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ierr = SNESSetUp(snes); CHKERRQ(ierr); > PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); > for (i=0; i PetscPrintf(PETSC_COMM_WORLD, "rank = %d\n",i); > SNESNASMGetSNES(snes,i,&subsnes); > // char prefix[10]; > // sprintf(prefix,"sub_%d_",i); > // SNESSetOptionsPrefix(subsnes,prefix); > } > ... > ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); > > > And, here is the output of the code when I run with 2 MPI procs: > SNESNASMGetSNES() gets the local subsolvers. It seems you only have one per process. You can check https://petsc.org/main/docs/manualpages/SNES/SNESNASMGetNumber/ Notice that your current code will not work because, according to your explanation, you only want to change the prefix on a single rank, so you need to check the rank when you do it. Thanks, Matt > takahashi at ubuntu:~/Desktop/MA-DDM/C/Rectangle$ mpiexec -n 2 ./test1 > Size = 2 > rank = 0 > rank = 1 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: No such subsolver > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.17.1, unknown > [0]PETSC ERROR: ./test1 on a linux-gnu-c-debug named ubuntu by takahashi > Fri Jun 17 06:06:38 2022 > [0]PETSC ERROR: Configure options --with-mpi-dir=/usr --with-fc=0 > [0]PETSC ERROR: #1 SNESNASMGetSNES() at > /home/takahashi/Desktop/petsc/src/snes/impls/nasm/nasm.c:923 > > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 976566 RUNNING AT ubuntu > = KILLED BY SIGNAL: 9 (Killed) > > =================================================================================== > > This error doesn't occur when I run this without MPI. However, I tried to > change the prefix of the subdomain to `sub_0_` but I am not able to change > the snes_type using this prefix. Running ./test1 -snes_view -help | grep > sub_0_snes_type prints nothing. 
> > On Thu, Jun 16, 2022 at 6:23 PM Matthew Knepley wrote: > >> On Thu, Jun 16, 2022 at 5:57 PM tt73 wrote: >> >>> >>> Hi, >>> >>> I am using NASM as the outer solver for a nonlinear problem. For one of >>> the subdomains, I want to run the local solve with a different set of >>> options form the others. Is there any way to set options for each >>> subdomain? >>> >> >> I can see two ways: >> >> 1) Pull out the subsolver and set it using the API >> >> 2) Pull out the subsolver and give it a different prefix >> >> Thanks, >> >> Matt >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 17 09:00:53 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Jun 2022 10:00:53 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: MPI_Comm_size(PETSC_COMM_WORLD,&size); MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > SNESNASMGetSNES(snes,0,&subsnes); > char prefix[10]; > sprintf(prefix,"sub_%d_",rank); > SNESSetOptionsPrefix(subsnes,prefix); > On Jun 17, 2022, at 9:35 AM, Matthew Knepley wrote: > > On Fri, Jun 17, 2022 at 9:22 AM Takahashi, Tadanaga > wrote: > I'm having some trouble pulling out the subsolver. I tried to use SNESNASMGetSNES in a loop over each subdomain. However I get an error when I run the code with more than one MPI processors. Here is a snippet from my code: > > SNES snes, subsnes; > PetscMPIInt rank, size; > ... > ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); > ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); > ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); > ... > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ierr = SNESSetUp(snes); CHKERRQ(ierr); > PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); > for (i=0; i PetscPrintf(PETSC_COMM_WORLD, "rank = %d\n",i); > SNESNASMGetSNES(snes,i,&subsnes); > // char prefix[10]; > // sprintf(prefix,"sub_%d_",i); > // SNESSetOptionsPrefix(subsnes,prefix); > } > ... > ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); > > > And, here is the output of the code when I run with 2 MPI procs: > > SNESNASMGetSNES() gets the local subsolvers. It seems you only have one per process. > You can check https://petsc.org/main/docs/manualpages/SNES/SNESNASMGetNumber/ > > Notice that your current code will not work because, according to your explanation, you only want to change > the prefix on a single rank, so you need to check the rank when you do it. > > Thanks, > > Matt > > takahashi at ubuntu:~/Desktop/MA-DDM/C/Rectangle$ mpiexec -n 2 ./test1 > Size = 2 > rank = 0 > rank = 1 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: No such subsolver > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.17.1, unknown > [0]PETSC ERROR: ./test1 on a linux-gnu-c-debug named ubuntu by takahashi Fri Jun 17 06:06:38 2022 > [0]PETSC ERROR: Configure options --with-mpi-dir=/usr --with-fc=0 > [0]PETSC ERROR: #1 SNESNASMGetSNES() at /home/takahashi/Desktop/petsc/src/snes/impls/nasm/nasm.c:923 > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 976566 RUNNING AT ubuntu > = KILLED BY SIGNAL: 9 (Killed) > =================================================================================== > > This error doesn't occur when I run this without MPI. However, I tried to change the prefix of the subdomain to `sub_0_` but I am not able to change the snes_type using this prefix. Running ./test1 -snes_view -help | grep sub_0_snes_type prints nothing. > > On Thu, Jun 16, 2022 at 6:23 PM Matthew Knepley > wrote: > On Thu, Jun 16, 2022 at 5:57 PM tt73 > wrote: > > Hi, > > I am using NASM as the outer solver for a nonlinear problem. For one of the subdomains, I want to run the local solve with a different set of options form the others. Is there any way to set options for each subdomain? > > I can see two ways: > > 1) Pull out the subsolver and set it using the API > > 2) Pull out the subsolver and give it a different prefix > > Thanks, > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tt73 at njit.edu Fri Jun 17 09:47:25 2022 From: tt73 at njit.edu (Takahashi, Tadanaga) Date: Fri, 17 Jun 2022 10:47:25 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: Thank you. I am now able to pull each subsnes, change its snes type through the API, and set a prefix. This is my updated code: SNES snes, subsnes; PetscMPIInt rank, size; ... ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); ... ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); ierr = SNESSetUp(snes); CHKERRQ(ierr); PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); PetscBarrier(NULL); for (i=0; i wrote: > > MPI_Comm_size(PETSC_COMM_WORLD,&size); > MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > > SNESNASMGetSNES(snes,0,&subsnes); >> char prefix[10]; >> sprintf(prefix,"sub_%d_",rank); >> SNESSetOptionsPrefix(subsnes,prefix); >> > > > > On Jun 17, 2022, at 9:35 AM, Matthew Knepley wrote: > > On Fri, Jun 17, 2022 at 9:22 AM Takahashi, Tadanaga wrote: > >> I'm having some trouble pulling out the subsolver. I tried to use >> SNESNASMGetSNES in a loop over each subdomain. However I get an error when >> I run the code with more than one MPI processors. Here is a snippet from my >> code: >> >> SNES snes, subsnes; >> PetscMPIInt rank, size; >> ... >> ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); >> ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); >> ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); >> ... 
>> ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); >> ierr = SNESSetUp(snes); CHKERRQ(ierr); >> PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); >> for (i=0; i> PetscPrintf(PETSC_COMM_WORLD, "rank = %d\n",i); >> SNESNASMGetSNES(snes,i,&subsnes); >> // char prefix[10]; >> // sprintf(prefix,"sub_%d_",i); >> // SNESSetOptionsPrefix(subsnes,prefix); >> } >> ... >> ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); >> >> >> And, here is the output of the code when I run with 2 MPI procs: >> > > SNESNASMGetSNES() gets the local subsolvers. It seems you only have one > per process. > You can check > https://petsc.org/main/docs/manualpages/SNES/SNESNASMGetNumber/ > > Notice that your current code will not work because, according to your > explanation, you only want to change > the prefix on a single rank, so you need to check the rank when you do it. > > Thanks, > > Matt > > >> takahashi at ubuntu:~/Desktop/MA-DDM/C/Rectangle$ mpiexec -n 2 ./test1 >> Size = 2 >> rank = 0 >> rank = 1 >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: No such subsolver >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.17.1, unknown >> [0]PETSC ERROR: ./test1 on a linux-gnu-c-debug named ubuntu by takahashi >> Fri Jun 17 06:06:38 2022 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/usr --with-fc=0 >> [0]PETSC ERROR: #1 SNESNASMGetSNES() at >> /home/takahashi/Desktop/petsc/src/snes/impls/nasm/nasm.c:923 >> >> >> =================================================================================== >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 0 PID 976566 RUNNING AT ubuntu >> = KILLED BY SIGNAL: 9 (Killed) >> >> =================================================================================== >> >> This error doesn't occur when I run this without MPI. However, I tried to >> change the prefix of the subdomain to `sub_0_` but I am not able to change >> the snes_type using this prefix. Running ./test1 -snes_view -help | grep >> sub_0_snes_type prints nothing. >> >> On Thu, Jun 16, 2022 at 6:23 PM Matthew Knepley >> wrote: >> >>> On Thu, Jun 16, 2022 at 5:57 PM tt73 wrote: >>> >>>> >>>> Hi, >>>> >>>> I am using NASM as the outer solver for a nonlinear problem. For one >>>> of the subdomains, I want to run the local solve with a different set of >>>> options form the others. Is there any way to set options for each >>>> subdomain? >>>> >>> >>> I can see two ways: >>> >>> 1) Pull out the subsolver and set it using the API >>> >>> 2) Pull out the subsolver and give it a different prefix >>> >>> Thanks, >>> >>> Matt >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
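Putting the pieces above together, a per-rank customization of the local NASM subsolver might look like the following untested sketch. It assumes one subdomain per rank, that snes is the NASM solver already set up as in the snippets above, and that PetscCall from PETSc 3.17 is available; whether calling SNESSetFromOptions on the subsolver is what makes the prefixed options take effect is an assumption, not something verified here.

SNES        subsnes;
PetscMPIInt rank, size;
char        prefix[32];

PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));
PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
PetscCall(SNESNASMGetSNES(snes, 0, &subsnes));                      /* the one local subdomain */
PetscCall(PetscSNPrintf(prefix, sizeof(prefix), "sub_%d_", rank));
PetscCall(SNESSetOptionsPrefix(subsnes, prefix));
if (rank < size - 1) PetscCall(SNESSetType(subsnes, SNESNEWTONLS)); /* regular subdomains */
else                 PetscCall(SNESSetType(subsnes, SNESFAS));      /* last subdomain */
PetscCall(SNESSetFromOptions(subsnes)); /* assumption: lets -sub_<rank>_... options be read */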
URL: From bsmith at petsc.dev Fri Jun 17 10:02:05 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Jun 2022 11:02:05 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: You do not need the loop over size. Each rank sets options and options prefix for its local objects and never anyone elses. > char prefix[10]; > sprintf(prefix,"sub_%d_",rank); > SNESNASMGetSNES(snes,0,&subsnes); > if (rank SNESSetType(subsnes,SNESNEWTONLS); CHKERRQ(ierr); // newton for regular domains > } else { > SNESSetType(subsnes,SNESFAS); CHKERRQ(ierr); // fas for last domain > } > SNESSetOptionsPrefix(subsnes,prefix); > } > } To get the prefix to work try calling SNESSetFromOptions(subsnes); immediately after your SNESSetOptionsPrefix(subsnes,prefix); call Matt, it looks like there may be a bug in NASM, except in one particular case, it never calls SNESSetFromOptions() on the subsenses. Barry > On Jun 17, 2022, at 10:47 AM, Takahashi, Tadanaga wrote: > > Thank you. I am now able to pull each subsnes, change its snes type through the API, and set a prefix. This is my updated code: > > SNES snes, subsnes; > PetscMPIInt rank, size; > ... > ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); > ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); > ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); > ... > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ierr = SNESSetUp(snes); CHKERRQ(ierr); > PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); > PetscBarrier(NULL); > for (i=0; i char prefix[10]; > sprintf(prefix,"sub_%d_",i); > if(i==rank) { > ierr = SNESNASMGetNumber(snes,&Nd); > printf("rank = %d has %d block(s)\n",i,Nd); > if (i SNESNASMGetSNES(snes,0,&subsnes); > SNESSetType(subsnes,SNESNEWTONLS); CHKERRQ(ierr); // newton for regular domains > } else { > SNESNASMGetSNES(snes,0,&subsnes); > SNESSetType(subsnes,SNESFAS); CHKERRQ(ierr); // fas for last domain > } > SNESSetOptionsPrefix(subsnes,prefix); > } > } > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ... > ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); > > However, I still cannot change SNES, KSP, and PC types for individual domains through the command arguments. I checked the subdomains with -snes_view ::ascii_info_detail and it does show that the prefixes are properly changed. It also shows that the SNES type for the last domain was successfully changed. But for some reason, I only have access to the SNES viewer options during runtime. 
For example, if I run mpiexec -n 4 ./test1 -sub_0_ksp_type gmres -help | grep sub_0 I get the output: > > Viewer (-sub_0_snes_convergence_estimate) options: > -sub_0_snes_convergence_estimate ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_view_pre) options: > -sub_0_snes_view_pre ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_view_pre binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_view_pre draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_view_pre socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_view_pre saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_test_jacobian_view) options: > -sub_0_snes_test_jacobian_view ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_test_jacobian_display) options: > -sub_0_snes_test_jacobian_display ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_ksp_converged_reason) options: > -sub_0_ksp_converged_reason ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_converged_reason) options: > -sub_0_snes_converged_reason ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_converged_reason binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_converged_reason 
draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_converged_reason socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_converged_reason saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_view) options: > -sub_0_snes_view ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_view binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_view draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_view socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_view saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_view_solution) options: > -sub_0_snes_view_solution ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_view_solution binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_view_solution draw[:[drawtype][:filename|format]] Draws object (PetscOptionsGetViewer) > -sub_0_snes_view_solution socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) > -sub_0_snes_view_solution saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) > Option left: name:-sub_0_ksp_type value: gmres > > Do you know what could be causing this? > > On Fri, Jun 17, 2022 at 10:00 AM Barry Smith > wrote: > > MPI_Comm_size(PETSC_COMM_WORLD,&size); > MPI_Comm_rank(PETSC_COMM_WORLD,&rank); >> SNESNASMGetSNES(snes,0,&subsnes); >> char prefix[10]; >> sprintf(prefix,"sub_%d_",rank); >> SNESSetOptionsPrefix(subsnes,prefix); > > > >> On Jun 17, 2022, at 9:35 AM, Matthew Knepley > wrote: >> >> On Fri, Jun 17, 2022 at 9:22 AM Takahashi, Tadanaga > wrote: >> I'm having some trouble pulling out the subsolver. I tried to use SNESNASMGetSNES in a loop over each subdomain. However I get an error when I run the code with more than one MPI processors. Here is a snippet from my code: >> >> SNES snes, subsnes; >> PetscMPIInt rank, size; >> ... >> ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); >> ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); >> ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); >> ... >> ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); >> ierr = SNESSetUp(snes); CHKERRQ(ierr); >> PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); >> for (i=0; i> PetscPrintf(PETSC_COMM_WORLD, "rank = %d\n",i); >> SNESNASMGetSNES(snes,i,&subsnes); >> // char prefix[10]; >> // sprintf(prefix,"sub_%d_",i); >> // SNESSetOptionsPrefix(subsnes,prefix); >> } >> ... >> ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); >> >> >> And, here is the output of the code when I run with 2 MPI procs: >> >> SNESNASMGetSNES() gets the local subsolvers. It seems you only have one per process. >> You can check https://petsc.org/main/docs/manualpages/SNES/SNESNASMGetNumber/ >> >> Notice that your current code will not work because, according to your explanation, you only want to change >> the prefix on a single rank, so you need to check the rank when you do it. 
>> >> Thanks, >> >> Matt >> >> takahashi at ubuntu:~/Desktop/MA-DDM/C/Rectangle$ mpiexec -n 2 ./test1 >> Size = 2 >> rank = 0 >> rank = 1 >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: No such subsolver >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.17.1, unknown >> [0]PETSC ERROR: ./test1 on a linux-gnu-c-debug named ubuntu by takahashi Fri Jun 17 06:06:38 2022 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/usr --with-fc=0 >> [0]PETSC ERROR: #1 SNESNASMGetSNES() at /home/takahashi/Desktop/petsc/src/snes/impls/nasm/nasm.c:923 >> >> =================================================================================== >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 0 PID 976566 RUNNING AT ubuntu >> = KILLED BY SIGNAL: 9 (Killed) >> =================================================================================== >> >> This error doesn't occur when I run this without MPI. However, I tried to change the prefix of the subdomain to `sub_0_` but I am not able to change the snes_type using this prefix. Running ./test1 -snes_view -help | grep sub_0_snes_type prints nothing. >> >> On Thu, Jun 16, 2022 at 6:23 PM Matthew Knepley > wrote: >> On Thu, Jun 16, 2022 at 5:57 PM tt73 > wrote: >> >> Hi, >> >> I am using NASM as the outer solver for a nonlinear problem. For one of the subdomains, I want to run the local solve with a different set of options form the others. Is there any way to set options for each subdomain? >> >> I can see two ways: >> >> 1) Pull out the subsolver and set it using the API >> >> 2) Pull out the subsolver and give it a different prefix >> >> Thanks, >> >> Matt >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tt73 at njit.edu Fri Jun 17 10:12:29 2022 From: tt73 at njit.edu (Takahashi, Tadanaga) Date: Fri, 17 Jun 2022 11:12:29 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: Ahh, I understand now. I got rid of the loop. Adding SNESSetFromOptions(subsnes) right after SNESSetOptionsPrefix(subsnes,prefix) did not fix the issue. On Fri, Jun 17, 2022 at 11:02 AM Barry Smith wrote: > > You do not need the loop over size. Each rank sets options and options > prefix for its local objects and never anyone elses. 
> > char prefix[10]; > > sprintf(prefix,"sub_%d_",rank); > > SNESNASMGetSNES(snes,0,&subsnes); > > if (rank SNESSetType(subsnes,SNESNEWTONLS); CHKERRQ(ierr); // newton > for regular domains > } else { > SNESSetType(subsnes,SNESFAS); CHKERRQ(ierr); // fas for last > domain > } > SNESSetOptionsPrefix(subsnes,prefix); > } > } > > > To get the prefix to work try calling SNESSetFromOptions(subsnes); > immediately after your SNESSetOptionsPrefix(subsnes,prefix); call > > Matt, it looks like there may be a bug in NASM, except in one > particular case, it never calls SNESSetFromOptions() on the subsenses. > > Barry > > > On Jun 17, 2022, at 10:47 AM, Takahashi, Tadanaga wrote: > > Thank you. I am now able to pull each subsnes, change its snes type > through the API, and set a prefix. This is my updated code: > > SNES snes, subsnes; > PetscMPIInt rank, size; > ... > ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); > ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); > ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); > ... > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ierr = SNESSetUp(snes); CHKERRQ(ierr); > PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); > PetscBarrier(NULL); > for (i=0; i char prefix[10]; > sprintf(prefix,"sub_%d_",i); > if(i==rank) { > ierr = SNESNASMGetNumber(snes,&Nd); > printf("rank = %d has %d block(s)\n",i,Nd); > if (i SNESNASMGetSNES(snes,0,&subsnes); > SNESSetType(subsnes,SNESNEWTONLS); CHKERRQ(ierr); // newton > for regular domains > } else { > SNESNASMGetSNES(snes,0,&subsnes); > SNESSetType(subsnes,SNESFAS); CHKERRQ(ierr); // fas for last > domain > } > SNESSetOptionsPrefix(subsnes,prefix); > } > } > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ... > ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); > > However, I still cannot change SNES, KSP, and PC types for > individual domains through the command arguments. I checked the subdomains > with -snes_view ::ascii_info_detail and it does show that the prefixes > are properly changed. It also shows that the SNES type for the last domain > was successfully changed. But for some reason, I only have access to the > SNES viewer options during runtime. 
For example, if I run mpiexec -n 4 > ./test1 -sub_0_ksp_type gmres -help | grep sub_0 I get the output: > > Viewer (-sub_0_snes_convergence_estimate) options: > -sub_0_snes_convergence_estimate ascii[:[filename][:[format][:append]]]: > Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate > binary[:[filename][:[format][:append]]]: Saves object to a binary file > (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate draw[:[drawtype][:filename|format]] > Draws object (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate socket[:port]: Pushes object to a Unix > socket (PetscOptionsGetViewer) > -sub_0_snes_convergence_estimate saws[:communicatorname]: Publishes > object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_view_pre) options: > -sub_0_snes_view_pre ascii[:[filename][:[format][:append]]]: Prints > object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_view_pre binary[:[filename][:[format][:append]]]: Saves > object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_view_pre draw[:[drawtype][:filename|format]] Draws object > (PetscOptionsGetViewer) > -sub_0_snes_view_pre socket[:port]: Pushes object to a Unix socket > (PetscOptionsGetViewer) > -sub_0_snes_view_pre saws[:communicatorname]: Publishes object to SAWs > (PetscOptionsGetViewer) > Viewer (-sub_0_snes_test_jacobian_view) options: > -sub_0_snes_test_jacobian_view ascii[:[filename][:[format][:append]]]: > Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view binary[:[filename][:[format][:append]]]: > Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view draw[:[drawtype][:filename|format]] Draws > object (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view socket[:port]: Pushes object to a Unix > socket (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_view saws[:communicatorname]: Publishes object > to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_test_jacobian_display) options: > -sub_0_snes_test_jacobian_display > ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII > file (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display > binary[:[filename][:[format][:append]]]: Saves object to a binary file > (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display draw[:[drawtype][:filename|format]] > Draws object (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display socket[:port]: Pushes object to a Unix > socket (PetscOptionsGetViewer) > -sub_0_snes_test_jacobian_display saws[:communicatorname]: Publishes > object to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_ksp_converged_reason) options: > -sub_0_ksp_converged_reason ascii[:[filename][:[format][:append]]]: > Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason binary[:[filename][:[format][:append]]]: > Saves object to a binary file (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason draw[:[drawtype][:filename|format]] Draws > object (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason socket[:port]: Pushes object to a Unix > socket (PetscOptionsGetViewer) > -sub_0_ksp_converged_reason saws[:communicatorname]: Publishes object to > SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_converged_reason) options: > -sub_0_snes_converged_reason ascii[:[filename][:[format][:append]]]: > Prints object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_converged_reason binary[:[filename][:[format][:append]]]: > Saves object to a binary file 
(PetscOptionsGetViewer) > -sub_0_snes_converged_reason draw[:[drawtype][:filename|format]] Draws > object (PetscOptionsGetViewer) > -sub_0_snes_converged_reason socket[:port]: Pushes object to a Unix > socket (PetscOptionsGetViewer) > -sub_0_snes_converged_reason saws[:communicatorname]: Publishes object > to SAWs (PetscOptionsGetViewer) > Viewer (-sub_0_snes_view) options: > -sub_0_snes_view ascii[:[filename][:[format][:append]]]: Prints object > to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_view binary[:[filename][:[format][:append]]]: Saves object > to a binary file (PetscOptionsGetViewer) > -sub_0_snes_view draw[:[drawtype][:filename|format]] Draws object > (PetscOptionsGetViewer) > -sub_0_snes_view socket[:port]: Pushes object to a Unix socket > (PetscOptionsGetViewer) > -sub_0_snes_view saws[:communicatorname]: Publishes object to SAWs > (PetscOptionsGetViewer) > Viewer (-sub_0_snes_view_solution) options: > -sub_0_snes_view_solution ascii[:[filename][:[format][:append]]]: Prints > object to stdout or ASCII file (PetscOptionsGetViewer) > -sub_0_snes_view_solution binary[:[filename][:[format][:append]]]: Saves > object to a binary file (PetscOptionsGetViewer) > -sub_0_snes_view_solution draw[:[drawtype][:filename|format]] Draws > object (PetscOptionsGetViewer) > -sub_0_snes_view_solution socket[:port]: Pushes object to a Unix socket > (PetscOptionsGetViewer) > -sub_0_snes_view_solution saws[:communicatorname]: Publishes object to > SAWs (PetscOptionsGetViewer) > Option left: name:-sub_0_ksp_type value: gmres > > > Do you know what could be causing this? > > On Fri, Jun 17, 2022 at 10:00 AM Barry Smith wrote: > >> >> MPI_Comm_size(PETSC_COMM_WORLD,&size); >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank); >> >> SNESNASMGetSNES(snes,0,&subsnes); >>> char prefix[10]; >>> sprintf(prefix,"sub_%d_",rank); >>> SNESSetOptionsPrefix(subsnes,prefix); >>> >> >> >> >> On Jun 17, 2022, at 9:35 AM, Matthew Knepley wrote: >> >> On Fri, Jun 17, 2022 at 9:22 AM Takahashi, Tadanaga >> wrote: >> >>> I'm having some trouble pulling out the subsolver. I tried to use >>> SNESNASMGetSNES in a loop over each subdomain. However I get an error when >>> I run the code with more than one MPI processors. Here is a snippet from my >>> code: >>> >>> SNES snes, subsnes; >>> PetscMPIInt rank, size; >>> ... >>> ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); >>> ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); >>> ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); >>> ... >>> ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); >>> ierr = SNESSetUp(snes); CHKERRQ(ierr); >>> PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); >>> for (i=0; i>> PetscPrintf(PETSC_COMM_WORLD, "rank = %d\n",i); >>> SNESNASMGetSNES(snes,i,&subsnes); >>> // char prefix[10]; >>> // sprintf(prefix,"sub_%d_",i); >>> // SNESSetOptionsPrefix(subsnes,prefix); >>> } >>> ... >>> ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); >>> >>> >>> And, here is the output of the code when I run with 2 MPI procs: >>> >> >> SNESNASMGetSNES() gets the local subsolvers. It seems you only have one >> per process. >> You can check >> https://petsc.org/main/docs/manualpages/SNES/SNESNASMGetNumber/ >> >> Notice that your current code will not work because, according to your >> explanation, you only want to change >> the prefix on a single rank, so you need to check the rank when you do it. 
>> >> Thanks, >> >> Matt >> >> >>> takahashi at ubuntu:~/Desktop/MA-DDM/C/Rectangle$ mpiexec -n 2 ./test1 >>> Size = 2 >>> rank = 0 >>> rank = 1 >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: No such subsolver >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.17.1, unknown >>> [0]PETSC ERROR: ./test1 on a linux-gnu-c-debug named ubuntu by takahashi >>> Fri Jun 17 06:06:38 2022 >>> [0]PETSC ERROR: Configure options --with-mpi-dir=/usr --with-fc=0 >>> [0]PETSC ERROR: #1 SNESNASMGetSNES() at >>> /home/takahashi/Desktop/petsc/src/snes/impls/nasm/nasm.c:923 >>> >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 976566 RUNNING AT ubuntu >>> = KILLED BY SIGNAL: 9 (Killed) >>> >>> =================================================================================== >>> >>> This error doesn't occur when I run this without MPI. However, I tried >>> to change the prefix of the subdomain to `sub_0_` but I am not able to >>> change the snes_type using this prefix. Running ./test1 -snes_view >>> -help | grep sub_0_snes_type prints nothing. >>> >>> On Thu, Jun 16, 2022 at 6:23 PM Matthew Knepley >>> wrote: >>> >>>> On Thu, Jun 16, 2022 at 5:57 PM tt73 wrote: >>>> >>>>> >>>>> Hi, >>>>> >>>>> I am using NASM as the outer solver for a nonlinear problem. For one >>>>> of the subdomains, I want to run the local solve with a different set of >>>>> options form the others. Is there any way to set options for each >>>>> subdomain? >>>>> >>>> >>>> I can see two ways: >>>> >>>> 1) Pull out the subsolver and set it using the API >>>> >>>> 2) Pull out the subsolver and give it a different prefix >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dfatiac at gmail.com Fri Jun 17 10:21:03 2022 From: dfatiac at gmail.com (Mario Rossi) Date: Fri, 17 Jun 2022 17:21:03 +0200 Subject: [petsc-users] Parallel matrix-vector product with Matrix Shell Message-ID: I need to find the largest eigenvalues (say the first three) of a very large matrix and I am using a combination of PetSc and SLEPc. In particular, I am using a shell matrix. I wrote a "custom" matrix-vector product and everything works fine in serial (one task) mode for a "small" case. For the real case, I need multiple (at least 128) tasks for memory reasons so I need a parallel variant of the custom matrix-vector product. I know exactly how to write the parallel variant (in plain MPI) but I am, somehow, blocked because it is not clear to me what each task receives and what is expected to provide in the parallel matrix-vector product. 
More in detail, with a single task, the function receives the full X vector and is expected to provide the full Y vector resulting from Y=A*X. What does it happen with multiple tasks? If I understand correctly in the matrix shell definition, I can choose to split the matrix into blocks of rows so that the matrix-vector function should compute a block of elements of the vector Y but does it receive only the corresponding subset of the X (input vector)? (this is what I guess happens) and in output, does each task return its subset of elements of Y as if it were the whole array and then PetSc manages all the subsets? Is there anyone who has a working example of a parallel matrix-vector product for matrix shell? Thanks in advance for any help you can provide! Mario i -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jun 17 10:33:30 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Jun 2022 11:33:30 -0400 Subject: [petsc-users] Customizing NASM subsnes In-Reply-To: References: <62aba746.1c69fb81.7df46.678d@mx.google.com> Message-ID: On Fri, Jun 17, 2022 at 11:02 AM Barry Smith wrote: > > You do not need the loop over size. Each rank sets options and options > prefix for its local objects and never anyone elses. > > char prefix[10]; > > sprintf(prefix,"sub_%d_",rank); > > SNESNASMGetSNES(snes,0,&subsnes); > > if (rank SNESSetType(subsnes,SNESNEWTONLS); CHKERRQ(ierr); // newton > for regular domains > } else { > SNESSetType(subsnes,SNESFAS); CHKERRQ(ierr); // fas for last > domain > } > SNESSetOptionsPrefix(subsnes,prefix); > } > } > > > To get the prefix to work try calling SNESSetFromOptions(subsnes); > immediately after your SNESSetOptionsPrefix(subsnes,prefix); call > > Matt, it looks like there may be a bug in NASM, except in one > particular case, it never calls SNESSetFromOptions() on the subsenses. > I bet Barry is correct. I can fix it, but unfortunately, I leave for a week long conference tomorrow, of which I am an organizer, so I don't think I can do it for a week. Thanks, Matt > Barry > > > On Jun 17, 2022, at 10:47 AM, Takahashi, Tadanaga wrote: > > Thank you. I am now able to pull each subsnes, change its snes type > through the API, and set a prefix. This is my updated code: > > SNES snes, subsnes; > PetscMPIInt rank, size; > ... > ierr = SNESCreate(PETSC_COMM_WORLD,&snes); CHKERRQ(ierr); > ierr = SNESSetType(snes,SNESNASM); CHKERRQ(ierr); > ierr = SNESNASMSetType(snes,PC_ASM_RESTRICT); CHKERRQ(ierr); > ... > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ierr = SNESSetUp(snes); CHKERRQ(ierr); > PetscPrintf(PETSC_COMM_WORLD, "Size = %d\n",size); > PetscBarrier(NULL); > for (i=0; i char prefix[10]; > sprintf(prefix,"sub_%d_",i); > if(i==rank) { > ierr = SNESNASMGetNumber(snes,&Nd); > printf("rank = %d has %d block(s)\n",i,Nd); > if (i SNESNASMGetSNES(snes,0,&subsnes); > SNESSetType(subsnes,SNESNEWTONLS); CHKERRQ(ierr); // newton > for regular domains > } else { > SNESNASMGetSNES(snes,0,&subsnes); > SNESSetType(subsnes,SNESFAS); CHKERRQ(ierr); // fas for last > domain > } > SNESSetOptionsPrefix(subsnes,prefix); > } > } > ierr = SNESSetFromOptions(snes); CHKERRQ(ierr); > ... > ierr = SNESSolve(snes,NULL,u_initial); CHKERRQ(ierr); > > However, I still cannot change SNES, KSP, and PC types for > individual domains through the command arguments. I checked the subdomains > with -snes_view ::ascii_info_detail and it does show that the prefixes > are properly changed. 
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jroman at dsic.upv.es  Fri Jun 17 10:33:58 2022
From: jroman at dsic.upv.es (Jose E. Roman)
Date: Fri, 17 Jun 2022 17:33:58 +0200
Subject: [petsc-users] Parallel matrix-vector product with Matrix Shell
In-Reply-To: 
References: 
Message-ID: <3AAEAEF6-82A9-478F-BAD9-DC3AE6B0025C@dsic.upv.es>

You can use VecGetOwnershipRange() to determine the range of global indices corresponding to the local portion of a vector, and VecGetArray() to access the values.
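For instance, a minimal sketch of a shell product that acts only on the locally owned block (the diagonal scaling is just a placeholder operator; a real product would also communicate whatever off-process entries of x it needs, as in the example linked below):

#include <petscmat.h>

PetscErrorCode MyShellMult(Mat A,Vec x,Vec y)
{
  PetscInt           rstart,rend,i;
  const PetscScalar *px;
  PetscScalar       *py;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(x,&rstart,&rend));  /* global rows owned by this process */
  PetscCall(VecGetArrayRead(x,&px));
  PetscCall(VecGetArray(y,&py));
  for (i=0; i<rend-rstart; i++) py[i] = 2.0*px[i];   /* local index i corresponds to global row rstart+i */
  PetscCall(VecRestoreArrayRead(x,&px));
  PetscCall(VecRestoreArray(y,&py));
  PetscFunctionReturn(0);
}

Such a routine is registered with MatCreateShell() and MatShellSetOperation(A,MATOP_MULT,(void(*)(void))MyShellMult).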
In SLEPc, you can assume that X and Y will have the same parallel distribution. For an example of a shell matrix that implements the matrix-vector product in parallel, have a look at this: https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html It is a simple tridiagonal example, where neighborwise communication is done with two calls to MPI_Sendrecv(). Jose > El 17 jun 2022, a las 17:21, Mario Rossi escribi?: > > I need to find the largest eigenvalues (say the first three) of a very large matrix and I am using > a combination of PetSc and SLEPc. In particular, I am using a shell matrix. I wrote a "custom" > matrix-vector product and everything works fine in serial (one task) mode for a "small" case. > For the real case, I need multiple (at least 128) tasks for memory reasons so I need a parallel variant of the custom matrix-vector product. I know exactly how to write the parallel variant > (in plain MPI) but I am, somehow, blocked because it is not clear to me what each task receives > and what is expected to provide in the parallel matrix-vector product. > More in detail, with a single task, the function receives the full X vector and is expected to provide the full Y vector resulting from Y=A*X. > What does it happen with multiple tasks? If I understand correctly > in the matrix shell definition, I can choose to split the matrix into blocks of rows so that the matrix-vector function should compute a block of elements of the vector Y but does it receive only the corresponding subset of the X (input vector)? (this is what I guess happens) and in output, does > each task return its subset of elements of Y as if it were the whole array and then PetSc manages all the subsets? Is there anyone who has a working example of a parallel matrix-vector product for matrix shell? > Thanks in advance for any help you can provide! > Mario > i > From dfatiac at gmail.com Fri Jun 17 10:56:34 2022 From: dfatiac at gmail.com (Mario Rossi) Date: Fri, 17 Jun 2022 17:56:34 +0200 Subject: [petsc-users] Parallel matrix-vector product with Matrix Shell In-Reply-To: <3AAEAEF6-82A9-478F-BAD9-DC3AE6B0025C@dsic.upv.es> References: <3AAEAEF6-82A9-478F-BAD9-DC3AE6B0025C@dsic.upv.es> Message-ID: Thanks a lot, Jose! I looked at the eps folder (where I found the test8.c that has been my starting point) but I did not look at the nep folder (my fault!) Thanks again, Mario Il giorno ven 17 giu 2022 alle ore 17:34 Jose E. Roman ha scritto: > You can use VecGetOwnershipRange() to determine the range of global > indices corresponding to the local portion of a vector, and VecGetArray() > to access the values. In SLEPc, you can assume that X and Y will have the > same parallel distribution. > > For an example of a shell matrix that implements the matrix-vector product > in parallel, have a look at this: > https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html > It is a simple tridiagonal example, where neighborwise communication is > done with two calls to MPI_Sendrecv(). > > Jose > > > > El 17 jun 2022, a las 17:21, Mario Rossi escribi?: > > > > I need to find the largest eigenvalues (say the first three) of a very > large matrix and I am using > > a combination of PetSc and SLEPc. In particular, I am using a shell > matrix. I wrote a "custom" > > matrix-vector product and everything works fine in serial (one task) > mode for a "small" case. > > For the real case, I need multiple (at least 128) tasks for memory > reasons so I need a parallel variant of the custom matrix-vector product. 
I > know exactly how to write the parallel variant > > (in plain MPI) but I am, somehow, blocked because it is not clear to me > what each task receives > > and what is expected to provide in the parallel matrix-vector product. > > More in detail, with a single task, the function receives the full X > vector and is expected to provide the full Y vector resulting from Y=A*X. > > What does it happen with multiple tasks? If I understand correctly > > in the matrix shell definition, I can choose to split the matrix into > blocks of rows so that the matrix-vector function should compute a block of > elements of the vector Y but does it receive only the corresponding subset > of the X (input vector)? (this is what I guess happens) and in output, does > > each task return its subset of elements of Y as if it were the whole > array and then PetSc manages all the subsets? Is there anyone who has a > working example of a parallel matrix-vector product for matrix shell? > > Thanks in advance for any help you can provide! > > Mario > > i > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leiyongxiang1205 at gmail.com Fri Jun 17 12:52:34 2022 From: leiyongxiang1205 at gmail.com (Yongxiang Lei) Date: Fri, 17 Jun 2022 18:52:34 +0100 Subject: [petsc-users] Pestc-matlab issue Message-ID: Dear concerns, I met such issues when I confirm that my Matlab installation is finished. I am also sure that the g++ version is given. Could you please check this problem for me? The related information is given as follows. Looking forward to hearing from you Best regards Xiang >> mex -setup cpp MEX configured to use 'g++' for C++ language compilation. ......................................................................................................... [ADS+u2192020 at cos8-25136ecf ~]$ g++ --version g++ (GCC) 8.5.0 20210514 (Red Hat 8.5.0-13) Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ............................................................................................................. 
export PETSC_DIR=$PWD export PETSC_ARCH=linux-debug ./configure \ --CC=$HOME/sfw/linux/openmpi-4.0.2/bin/mpicc \ --CXX=$HOME/sfw/linux/openmpi-4.0.2/bin/mpicxx \ --FC=$HOME/sfw/linux/openmpi-4.0.2/bin/mpif90 \ --with-debugging=1 \ --download-hypre=1 \ --download-fblaslapack=1 \ --with-x=0\ --with-matlab-dir=/home/ADS/u2192020/Matlab2022/ \ --with-matlab-engine=1 \ --with-matlab-engine-dir=/home/ADS/u2192020/Matlab2022/extern/engines/ ========================================= ========================================= Now to check if the libraries are working do: make PETSC_DIR=/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB PETSC_ARCH=linux-debug check ========================================= [ADS+u2192020 at cos8-25136ecf PETSc-MATLAB]$ make -j4 check Running check examples to verify correct installation Using PETSC_DIR=/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB and PETSC_ARCH=linux-debug gmake[3]: [/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/lib/petsc/conf/rules:320: ex19.PETSc] Error 2 (ignored) *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/src/snes/tutorials ex19 ********************************************************************************* /home/ADS/u2192020/sfw/linux/openmpi-4.0.2/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0 -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0 -I/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/include -I/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/include -I/home/ADS/u2192020/Matlab2022/extern/include ex19.c -Wl,-rpath,/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -L/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -Wl,-rpath,/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -L/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -Wl,-rpath,/home/ADS/u2192020/Matlab2022/bin/glnxa64 -L/home/ADS/u2192020/Matlab2022/bin/glnxa64 -Wl,-rpath,/home/ADS/u2192020/Matlab2022/extern/lib/glnxa64 -L/home/ADS/u2192020/Matlab2022/extern/lib/glnxa64 -Wl,-rpath,/home/ADS/u2192020/sfw/linux/openmpi-4.0.2/lib -L/home/ADS/u2192020/sfw/linux/openmpi-4.0.2/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/8 -L/usr/lib/gcc/x86_64-redhat-linux/8 -lpetsc -lHYPRE -lflapack -lfblas -lm -leng -lmex -lmx -lmat -lmwm_dispatcher -lmwopcmodel -lmwservices -lmwopcmodel -lmwm_dispatcher -lmwmpath -lmwopcmodel -lmwservices -lmwopcmodel -lmwservices -lxerces-c -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl -o ex19 //home/ADS/u2192020/Matlab2022/bin/glnxa64/libmwopccore.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()@GLIBCXX_3.4.26' //home/ADS/u2192020/Matlab2022/bin/glnxa64/libmwfoundation_matlabdata_matlab.so: undefined reference to `std::logic_error::logic_error(std::logic_error&&)@GLIBCXX_3.4.26' /home/ADS/u2192020/Matlab2022/bin/glnxa64/libmex.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()@GLIBCXX_3.4.26' /home/ADS/u2192020/Matlab2022/bin/glnxa64/libmx.so: undefined reference to `std::__cxx11::basic_stringstream, std::allocator >::basic_stringstream()@GLIBCXX_3.4.26' collect2: error: ld returned 1 exit status gmake[4]: *** [: ex19] Error 1 1,5c1,10 < lid velocity = 
0.0016, prandtl # = 1., grashof # = 1. < 0 SNES Function norm 0.0406612 < 1 SNES Function norm 4.12227e-06 < 2 SNES Function norm 6.098e-11 < Number of SNES iterations = 2 --- > -------------------------------------------------------------------------- > mpiexec was unable to launch the specified application as it could not access > or execute an executable: > > Executable: ./ex19 > Node: cos8-25136ecf > > while attempting to start process rank 0. > -------------------------------------------------------------------------- > 2 total processes failed to start /home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/src/snes/tutorials Possible problem with ex19 running with hypre, diffs above ========================================= gmake[3]: [/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/lib/petsc/conf/rules:374: ex5f.PETSc] Error 2 (ignored) *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/src/snes/tutorials ex5f ********************************************************* /home/ADS/u2192020/sfw/linux/openmpi-4.0.2/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0 -fPIC -Wall -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0 -I/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/include -I/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/include -I/home/ADS/u2192020/Matlab2022/extern/include ex5f.F90 -Wl,-rpath,/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -L/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -Wl,-rpath,/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -L/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -Wl,-rpath,/home/ADS/u2192020/Matlab2022/bin/glnxa64 -L/home/ADS/u2192020/Matlab2022/bin/glnxa64 -Wl,-rpath,/home/ADS/u2192020/Matlab2022/extern/lib/glnxa64 -L/home/ADS/u2192020/Matlab2022/extern/lib/glnxa64 -Wl,-rpath,/home/ADS/u2192020/sfw/linux/openmpi-4.0.2/lib -L/home/ADS/u2192020/sfw/linux/openmpi-4.0.2/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/8 -L/usr/lib/gcc/x86_64-redhat-linux/8 -lpetsc -lHYPRE -lflapack -lfblas -lm -leng -lmex -lmx -lmat -lmwm_dispatcher -lmwopcmodel -lmwservices -lmwopcmodel -lmwm_dispatcher -lmwmpath -lmwopcmodel -lmwservices -lmwopcmodel -lmwservices -lxerces-c -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl -o ex5f //home/ADS/u2192020/Matlab2022/bin/glnxa64/libmwopccore.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()@GLIBCXX_3.4.26' //home/ADS/u2192020/Matlab2022/bin/glnxa64/libmwfoundation_matlabdata_matlab.so: undefined reference to `std::logic_error::logic_error(std::logic_error&&)@GLIBCXX_3.4.26' /home/ADS/u2192020/Matlab2022/bin/glnxa64/libmex.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()@GLIBCXX_3.4.26' /home/ADS/u2192020/Matlab2022/bin/glnxa64/libmx.so: undefined reference to `std::__cxx11::basic_stringstream, std::allocator >::basic_stringstream()@GLIBCXX_3.4.26' collect2: error: ld returned 1 exit status gmake[4]: *** [/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/lib/petsc/conf/test:43: ex5f] Error 1 gmake[3]: [/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/lib/petsc/conf/rules:320: ex31.PETSc] Error 2 (ignored) *******************Error detected during compile or link!******************* See 
http://www.mcs.anl.gov/petsc/documentation/faq.html /home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/src/vec/vec/tutorials ex31 ********************************************************************************* /home/ADS/u2192020/sfw/linux/openmpi-4.0.2/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0 -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g3 -O0 -I/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/include -I/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/include -I/home/ADS/u2192020/Matlab2022/extern/include ex31.c -Wl,-rpath,/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -L/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -Wl,-rpath,/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -L/home/ADS/u2192020/sfw/petsc/PETSc-MATLAB/linux-debug/lib -Wl,-rpath,/home/ADS/u2192020/Matlab2022/bin/glnxa64 -L/home/ADS/u2192020/Matlab2022/bin/glnxa64 -Wl,-rpath,/home/ADS/u2192020/Matlab2022/extern/lib/glnxa64 -L/home/ADS/u2192020/Matlab2022/extern/lib/glnxa64 -Wl,-rpath,/home/ADS/u2192020/sfw/linux/openmpi-4.0.2/lib -L/home/ADS/u2192020/sfw/linux/openmpi-4.0.2/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/8 -L/usr/lib/gcc/x86_64-redhat-linux/8 -lpetsc -lHYPRE -lflapack -lfblas -lm -leng -lmex -lmx -lmat -lmwm_dispatcher -lmwopcmodel -lmwservices -lmwopcmodel -lmwm_dispatcher -lmwmpath -lmwopcmodel -lmwservices -lmwopcmodel -lmwservices -lxerces-c -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl -o ex31 //home/ADS/u2192020/Matlab2022/bin/glnxa64/libmwopccore.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()@GLIBCXX_3.4.26' //home/ADS/u2192020/Matlab2022/bin/glnxa64/libmwfoundation_matlabdata_matlab.so: undefined reference to `std::logic_error::logic_error(std::logic_error&&)@GLIBCXX_3.4.26' /home/ADS/u2192020/Matlab2022/bin/glnxa64/libmex.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()@GLIBCXX_3.4.26' /home/ADS/u2192020/Matlab2022/bin/glnxa64/libmx.so: undefined reference to `std::__cxx11::basic_stringstream, std::allocator >::basic_stringstream()@GLIBCXX_3.4.26' collect2: error: ld returned 1 exit status gmake[4]: *** [: ex31] Error 1 Completed test examples [ADS+u2192020 at cos8-25136ecf PETSc-MATLAB]$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Jun 17 13:07:19 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Jun 2022 14:07:19 -0400 Subject: [petsc-users] Pestc-matlab issue In-Reply-To: References: Message-ID: <246D1F71-9210-4DEA-B8CC-C999EC7C5B00@petsc.dev> Please send configure.log and make.log to petsc-maint at mcs.anl.gov > On Jun 17, 2022, at 1:52 PM, Yongxiang Lei wrote: > > Dear concerns, > > I met such issues when I confirm that my Matlab installation is finished. I am also sure that the g++ version is given. Could you please check this problem for me? The related information is given as follows. > > Looking forward to hearing from you > Best regards > Xiang > > >> mex -setup cpp > MEX configured to use 'g++' for C++ language compilation. > ......................................................................................................... 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From dfatiac at gmail.com Sat Jun 18 01:13:55 2022 From: dfatiac at gmail.com (Mario Rossi) Date: Sat, 18 Jun 2022 08:13:55 +0200 Subject: [petsc-users] Parallel matrix-vector product with Matrix Shell In-Reply-To: References: <3AAEAEF6-82A9-478F-BAD9-DC3AE6B0025C@dsic.upv.es> Message-ID: Dear Jose and Petsc users, I implemented the parallel matrix-vector product and it works meaning that it produces a result but it is different from the result produced with a single task. Obviously, I could be wrong in my implementation but what puzzles me is that the *input *vector (x) to the product is different running with one and two tasks and this is from the very first iteration (so it can not be due to a previous error in the product). I checked that X is different with one and two tasks with the following (naive) code PetscErrorCode MatMult_TM(Mat A,Vec x,Vec y) { void *ctx; PetscInt nx /* ,lo,i,j*/; const PetscScalar *px; PetscScalar *py; MPI_Comm comm; PetscFunctionBeginUser; PetscCall(MatShellGetContext(A,&ctx)); PetscCall(VecGetLocalSize(x,&nx)); PetscCall(PetscObjectGetComm((PetscObject)A,&comm)); // nx = *(int*)ctx; PetscCall(VecGetArrayRead(x,&px)); PetscCall(VecGetArray(y,&py)); for(int i=0;i ha scritto: > Thanks a lot, Jose! > I looked at the eps folder (where I found the test8.c that has been my > starting point) but I did not look at the nep folder (my fault!) > Thanks again, > Mario > > Il giorno ven 17 giu 2022 alle ore 17:34 Jose E. Roman > ha scritto: > >> You can use VecGetOwnershipRange() to determine the range of global >> indices corresponding to the local portion of a vector, and VecGetArray() >> to access the values. In SLEPc, you can assume that X and Y will have the >> same parallel distribution. >> >> For an example of a shell matrix that implements the matrix-vector >> product in parallel, have a look at this: >> https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html >> It is a simple tridiagonal example, where neighborwise communication is >> done with two calls to MPI_Sendrecv(). >> >> Jose >> >> >> > El 17 jun 2022, a las 17:21, Mario Rossi escribi?: >> > >> > I need to find the largest eigenvalues (say the first three) of a very >> large matrix and I am using >> > a combination of PetSc and SLEPc. In particular, I am using a shell >> matrix. I wrote a "custom" >> > matrix-vector product and everything works fine in serial (one task) >> mode for a "small" case. >> > For the real case, I need multiple (at least 128) tasks for memory >> reasons so I need a parallel variant of the custom matrix-vector product. I >> know exactly how to write the parallel variant >> > (in plain MPI) but I am, somehow, blocked because it is not clear to me >> what each task receives >> > and what is expected to provide in the parallel matrix-vector product. >> > More in detail, with a single task, the function receives the full X >> vector and is expected to provide the full Y vector resulting from Y=A*X. >> > What does it happen with multiple tasks? If I understand correctly >> > in the matrix shell definition, I can choose to split the matrix into >> blocks of rows so that the matrix-vector function should compute a block of >> elements of the vector Y but does it receive only the corresponding subset >> of the X (input vector)? (this is what I guess happens) and in output, does >> > each task return its subset of elements of Y as if it were the whole >> array and then PetSc manages all the subsets? 
Is there anyone who has a >> working example of a parallel matrix-vector product for matrix shell? >> > Thanks in advance for any help you can provide! >> > Mario >> > i >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sat Jun 18 01:16:08 2022 From: yangzongze at gmail.com (Zongze Yang) Date: Sat, 18 Jun 2022 14:16:08 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: In order to check if I made mistakes in the python code, I try to use c code to show the issue on DMProjectCoordinates. The code and mesh file is attached. If the code is correct, there must be something wrong with `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. The command and the output are listed below: (Obviously the bounding box is changed.) ``` $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view Old Bounding Box: 0: lo = 0. hi = 1. 1: lo = 0. hi = 1. 2: lo = 0. hi = 1. PetscFE Object: OldCoordinatesFE 1 MPI processes type: basic Basic Finite Element in 3 dimensions with 3 components PetscSpace Object: P2 1 MPI processes type: sum Space in 3 variables with 3 components, size 30 Sum space of 3 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 1 MPI processes type: poly Space in 3 variables with 1 components, size 10 Polynomial space of degree 2 PetscDualSpace Object: P2 1 MPI processes type: lagrange Dual space with 3 components, size 30 Discontinuous Lagrange dual space Quadrature of order 5 on 27 points (dim 3) PetscFE Object: NewCoordinatesFE 1 MPI processes type: basic Basic Finite Element in 3 dimensions with 3 components PetscSpace Object: P2 1 MPI processes type: sum Space in 3 variables with 3 components, size 30 Sum space of 3 concatenated subspaces (all identical) PetscSpace Object: sum component (sumcomp_) 1 MPI processes type: poly Space in 3 variables with 1 components, size 10 Polynomial space of degree 2 PetscDualSpace Object: P2 1 MPI processes type: lagrange Dual space with 3 components, size 30 Continuous Lagrange dual space Quadrature of order 5 on 27 points (dim 3) New Bounding Box: 0: lo = 2.5624e-17 hi = 8. 1: lo = -9.23372e-17 hi = 7. 2: lo = 2.72091e-17 hi = 8.5 ``` Thanks, Zongze Zongze Yang ?2022?6?17??? 14:54??? > I tried the projection operation. However, it seems that the projection > gives the wrong solution. After projection, the bounding box is changed! > See logs below. 
> > First, I patch the petsc4py by adding `DMProjectCoordinates`: > ``` > diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx > b/src/binding/petsc4py/src/PETSc/DM.pyx > index d8a58d183a..dbcdb280f1 100644 > --- a/src/binding/petsc4py/src/PETSc/DM.pyx > +++ b/src/binding/petsc4py/src/PETSc/DM.pyx > @@ -307,6 +307,12 @@ cdef class DM(Object): > PetscINCREF(c.obj) > return c > > + def projectCoordinates(self, FE fe=None): > + if fe is None: > + CHKERR( DMProjectCoordinates(self.dm, NULL) ) > + else: > + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) > + > def getBoundingBox(self): > cdef PetscInt i,dim=0 > CHKERR( DMGetCoordinateDim(self.dm, &dim) ) > diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi > b/src/binding/petsc4py/src/PETSc/petscdm.pxi > index 514b6fa472..c778e39884 100644 > --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi > +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi > @@ -90,6 +90,7 @@ cdef extern from * nogil: > int DMGetCoordinateDim(PetscDM,PetscInt*) > int DMSetCoordinateDim(PetscDM,PetscInt) > int DMLocalizeCoordinates(PetscDM) > + int DMProjectCoordinates(PetscDM, PetscFE) > > int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) > int DMCreateInjection(PetscDM,PetscDM,PetscMat*) > ``` > > Then in python, I load a mesh and project the coordinates to P2: > ``` > import firedrake as fd > from firedrake.petsc import PETSc > > # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') > plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') > print('old bbox:', plex.getBoundingBox()) > > dim = plex.getDimension() > # (dim, nc, isSimplex, k, > qorder, comm=None) > fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, PETSc.DETERMINE) > plex.projectCoordinates(fe_new) > fe_new.view() > > print('new bbox:', plex.getBoundingBox()) > ``` > > The output is (The bounding box is changed!) > ``` > > old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) > PetscFE Object: P2 1 MPI processes > type: basic > Basic Finite Element in 3 dimensions with 3 components > PetscSpace Object: P2 1 MPI processes > type: sum > Space in 3 variables with 3 components, size 30 > Sum space of 3 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI processes > type: poly > Space in 3 variables with 1 components, size 10 > Polynomial space of degree 2 > PetscDualSpace Object: P2 1 MPI processes > type: lagrange > Dual space with 3 components, size 30 > Continuous Lagrange dual space > Quadrature of order 5 on 27 points (dim 3) > new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) > > ``` > > > By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? > > > Thanks! > > > Zongze > > > > Matthew Knepley ?2022?6?17??? 01:11??? > >> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >> wrote: >> >>> >>> >>> ? 2022?6?16??23:22?Matthew Knepley ??? >>> >>> ? >>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>> wrote: >>> >>>> Hi, if I load a `gmsh` file with second-order elements, the coordinates >>>> will be stored in a DG-P2 space. After obtaining the coordinates of a cell, >>>> how can I map the coordinates to vertex and edge? >>>> >>> >>> By default, they are stored as P2, not DG. >>> >>> >>> I checked the coordinates vector, and found the dogs only defined on >>> cell other than vertex and edge, so I said they are stored as DG. 
>>> Then the function DMPlexVecGetClosure >>> seems return >>> the coordinates in lex order. >>> >>> Some code in reading gmsh file reads that >>> >>> >>> 1756: if (isSimplex) continuity = PETSC_FALSE >>> ; /* XXX >>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>> >>> >>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, >>> dim, coordDim, order, &fe) >>> >>> >>> The continuity is set to false for simplex. >>> >> >> Oh, yes. That needs to be fixed. For now, you can just project it to P2 >> if you want using >> >> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Zongze >>> >>> You can ask for the coordinates of a vertex or an edge directly using >>> >>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>> >>> by giving the vertex or edge point. You can get all the coordinates on a >>> cell, in the closure order, using >>> >>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Below is some code load the gmsh file, I want to know the relation >>>> between `cl` and `cell_coords`. >>>> >>>> ``` >>>> import firedrake as fd >>>> import numpy as np >>>> >>>> # Load gmsh file (2rd) >>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>> >>>> cs, ce = plex.getHeightStratum(0) >>>> >>>> cdm = plex.getCoordinateDM() >>>> csec = dm.getCoordinateSection() >>>> coords_gvec = dm.getCoordinates() >>>> >>>> for i in range(cs, ce): >>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, 3])}') >>>> cl = dm.getTransitiveClosure(i) >>>> print('closure:', cl) >>>> break >>>> ``` >>>> >>>> Best wishes, >>>> Zongze >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-p2.msh Type: application/octet-stream Size: 5210 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_gmsh_load_2rd.c Type: application/octet-stream Size: 2297 bytes Desc: not available URL: From jroman at dsic.upv.es Sat Jun 18 01:27:25 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 18 Jun 2022 08:27:25 +0200 Subject: [petsc-users] Parallel matrix-vector product with Matrix Shell In-Reply-To: References: <3AAEAEF6-82A9-478F-BAD9-DC3AE6B0025C@dsic.upv.es> Message-ID: <35218C6D-1BA4-4E2D-8A27-8860B6CF3E33@dsic.upv.es> The initial vector of the Krylov method is by default a random vector, which is different when you change the number of processes. To avoid this, you can run with the undocumented option -bv_reproducible_random which will generate the same random initial vector irrespective of the number of processes. Alternatively, set an initial vector in your code with EPSSetInitialSpace(), see e.g. 
https://slepc.upv.es/documentation/current/src/eps/tutorials/ex5.c.html Jose > El 18 jun 2022, a las 8:13, Mario Rossi escribi?: > > Dear Jose and Petsc users, I implemented the parallel matrix-vector product and it works meaning that it produces a result but it is different from the result produced with a single task. > Obviously, I could be wrong in my implementation but what puzzles me is that the input vector (x) to the product is different running with one and two tasks and this is from the very first iteration (so it can not be due to a previous error in the product). > I checked that X is different with one and two tasks with the following (naive) code > PetscErrorCode MatMult_TM(Mat A,Vec x,Vec y) { > void *ctx; > PetscInt nx /* ,lo,i,j*/; > const PetscScalar *px; > PetscScalar *py; > MPI_Comm comm; > PetscFunctionBeginUser; > PetscCall(MatShellGetContext(A,&ctx)); > PetscCall(VecGetLocalSize(x,&nx)); > PetscCall(PetscObjectGetComm((PetscObject)A,&comm)); > > // nx = *(int*)ctx; > PetscCall(VecGetArrayRead(x,&px)); > PetscCall(VecGetArray(y,&py)); > > for(int i=0;i PetscCall(MPI_Barrier(comm)); > exit(0); > ...... > } > > Then I reordered the output obtained with one and two tasks. The first part of the x vector is very similar (but not exactly the same) using one and two tasks but the second part (belonging to the second task) is pretty different > (here "offset" is offset=(n/size)*myrank;) > I create the matrix shell with > PetscCall(MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N,N,&n,&A)); > I am sure I am doing something wrong but I don't know what I need to look at. > Thanks in advance! > Mario > > > Il giorno ven 17 giu 2022 alle ore 17:56 Mario Rossi ha scritto: > Thanks a lot, Jose! > I looked at the eps folder (where I found the test8.c that has been my starting point) but I did not look at the nep folder (my fault!) > Thanks again, > Mario > > Il giorno ven 17 giu 2022 alle ore 17:34 Jose E. Roman ha scritto: > You can use VecGetOwnershipRange() to determine the range of global indices corresponding to the local portion of a vector, and VecGetArray() to access the values. In SLEPc, you can assume that X and Y will have the same parallel distribution. > > For an example of a shell matrix that implements the matrix-vector product in parallel, have a look at this: https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html > It is a simple tridiagonal example, where neighborwise communication is done with two calls to MPI_Sendrecv(). > > Jose > > > > El 17 jun 2022, a las 17:21, Mario Rossi escribi?: > > > > I need to find the largest eigenvalues (say the first three) of a very large matrix and I am using > > a combination of PetSc and SLEPc. In particular, I am using a shell matrix. I wrote a "custom" > > matrix-vector product and everything works fine in serial (one task) mode for a "small" case. > > For the real case, I need multiple (at least 128) tasks for memory reasons so I need a parallel variant of the custom matrix-vector product. I know exactly how to write the parallel variant > > (in plain MPI) but I am, somehow, blocked because it is not clear to me what each task receives > > and what is expected to provide in the parallel matrix-vector product. > > More in detail, with a single task, the function receives the full X vector and is expected to provide the full Y vector resulting from Y=A*X. > > What does it happen with multiple tasks? 
If I understand correctly > > in the matrix shell definition, I can choose to split the matrix into blocks of rows so that the matrix-vector function should compute a block of elements of the vector Y but does it receive only the corresponding subset of the X (input vector)? (this is what I guess happens) and in output, does > > each task return its subset of elements of Y as if it were the whole array and then PetSc manages all the subsets? Is there anyone who has a working example of a parallel matrix-vector product for matrix shell? > > Thanks in advance for any help you can provide! > > Mario > > i > > > From knepley at gmail.com Sat Jun 18 07:02:44 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 18 Jun 2022 08:02:44 -0400 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang wrote: > In order to check if I made mistakes in the python code, I try to use c > code to show the issue on DMProjectCoordinates. The code and mesh file is > attached. > If the code is correct, there must be something wrong with > `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. > Something is definitely wrong with high order, periodic simplices from Gmsh. We had not tested that case. I am at a conference and cannot look at it for a week. My suspicion is that the space we make when reading in the Gmsh coordinates does not match the values (wrong order). Thanks, Matt > The command and the output are listed below: (Obviously the bounding box > is changed.) > ``` > $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view > Old Bounding Box: > 0: lo = 0. hi = 1. > 1: lo = 0. hi = 1. > 2: lo = 0. hi = 1. > PetscFE Object: OldCoordinatesFE 1 MPI processes > type: basic > Basic Finite Element in 3 dimensions with 3 components > PetscSpace Object: P2 1 MPI processes > type: sum > Space in 3 variables with 3 components, size 30 > Sum space of 3 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI processes > type: poly > Space in 3 variables with 1 components, size 10 > Polynomial space of degree 2 > PetscDualSpace Object: P2 1 MPI processes > type: lagrange > Dual space with 3 components, size 30 > Discontinuous Lagrange dual space > Quadrature of order 5 on 27 points (dim 3) > PetscFE Object: NewCoordinatesFE 1 MPI processes > type: basic > Basic Finite Element in 3 dimensions with 3 components > PetscSpace Object: P2 1 MPI processes > type: sum > Space in 3 variables with 3 components, size 30 > Sum space of 3 concatenated subspaces (all identical) > PetscSpace Object: sum component (sumcomp_) 1 MPI processes > type: poly > Space in 3 variables with 1 components, size 10 > Polynomial space of degree 2 > PetscDualSpace Object: P2 1 MPI processes > type: lagrange > Dual space with 3 components, size 30 > Continuous Lagrange dual space > Quadrature of order 5 on 27 points (dim 3) > New Bounding Box: > 0: lo = 2.5624e-17 hi = 8. > 1: lo = -9.23372e-17 hi = 7. > 2: lo = 2.72091e-17 hi = 8.5 > ``` > > Thanks, > Zongze > > Zongze Yang ?2022?6?17??? 14:54??? > >> I tried the projection operation. However, it seems that the projection >> gives the wrong solution. After projection, the bounding box is changed! >> See logs below. 
>> >> First, I patch the petsc4py by adding `DMProjectCoordinates`: >> ``` >> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >> b/src/binding/petsc4py/src/PETSc/DM.pyx >> index d8a58d183a..dbcdb280f1 100644 >> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >> @@ -307,6 +307,12 @@ cdef class DM(Object): >> PetscINCREF(c.obj) >> return c >> >> + def projectCoordinates(self, FE fe=None): >> + if fe is None: >> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >> + else: >> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >> + >> def getBoundingBox(self): >> cdef PetscInt i,dim=0 >> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >> index 514b6fa472..c778e39884 100644 >> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >> @@ -90,6 +90,7 @@ cdef extern from * nogil: >> int DMGetCoordinateDim(PetscDM,PetscInt*) >> int DMSetCoordinateDim(PetscDM,PetscInt) >> int DMLocalizeCoordinates(PetscDM) >> + int DMProjectCoordinates(PetscDM, PetscFE) >> >> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >> ``` >> >> Then in python, I load a mesh and project the coordinates to P2: >> ``` >> import firedrake as fd >> from firedrake.petsc import PETSc >> >> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >> print('old bbox:', plex.getBoundingBox()) >> >> dim = plex.getDimension() >> # (dim, nc, isSimplex, k, >> qorder, comm=None) >> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >> PETSc.DETERMINE) >> plex.projectCoordinates(fe_new) >> fe_new.view() >> >> print('new bbox:', plex.getBoundingBox()) >> ``` >> >> The output is (The bounding box is changed!) >> ``` >> >> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >> PetscFE Object: P2 1 MPI processes >> type: basic >> Basic Finite Element in 3 dimensions with 3 components >> PetscSpace Object: P2 1 MPI processes >> type: sum >> Space in 3 variables with 3 components, size 30 >> Sum space of 3 concatenated subspaces (all identical) >> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >> type: poly >> Space in 3 variables with 1 components, size 10 >> Polynomial space of degree 2 >> PetscDualSpace Object: P2 1 MPI processes >> type: lagrange >> Dual space with 3 components, size 30 >> Continuous Lagrange dual space >> Quadrature of order 5 on 27 points (dim 3) >> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >> >> ``` >> >> >> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >> >> >> Thanks! >> >> >> Zongze >> >> >> >> Matthew Knepley ?2022?6?17??? 01:11??? >> >>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>> wrote: >>> >>>> >>>> >>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>> >>>> ? >>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>> wrote: >>>> >>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>> >>>> >>>> By default, they are stored as P2, not DG. 
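For the closure question discussed here, a short C sketch of reading one cell's coordinate dofs in closure order may help; it assumes an interpolated DMPlex named dm, and the variable names are illustrative:

```c
DM           cdm;
PetscSection csec;
Vec          coords;
PetscScalar *vals = NULL;
PetscInt     cStart, cEnd, nvals;

PetscCall(DMGetCoordinateDM(dm, &cdm));
PetscCall(DMGetCoordinateSection(dm, &csec));
PetscCall(DMGetCoordinatesLocal(dm, &coords));
PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));          /* cells are at height 0 */
PetscCall(DMPlexVecGetClosure(cdm, csec, coords, cStart, &nvals, &vals));
/* vals now holds the nvals coordinate dofs of cell cStart, in closure order */
PetscCall(DMPlexVecRestoreClosure(cdm, csec, coords, cStart, &nvals, &vals));
```

This returns the dofs in closure order, which is exactly the ordering the rest of this thread is trying to relate to the vertex and edge numbering.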
>>>> >>>> >>>> I checked the coordinates vector, and found the dogs only defined on >>>> cell other than vertex and edge, so I said they are stored as DG. >>>> Then the function DMPlexVecGetClosure >>>> seems return >>>> the coordinates in lex order. >>>> >>>> Some code in reading gmsh file reads that >>>> >>>> >>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>> ; /* XXX >>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>> >>>> >>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, >>>> dim, coordDim, order, &fe) >>>> >>>> >>>> The continuity is set to false for simplex. >>>> >>> >>> Oh, yes. That needs to be fixed. For now, you can just project it to P2 >>> if you want using >>> >>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Zongze >>>> >>>> You can ask for the coordinates of a vertex or an edge directly using >>>> >>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>> >>>> by giving the vertex or edge point. You can get all the coordinates on >>>> a cell, in the closure order, using >>>> >>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Below is some code load the gmsh file, I want to know the relation >>>>> between `cl` and `cell_coords`. >>>>> >>>>> ``` >>>>> import firedrake as fd >>>>> import numpy as np >>>>> >>>>> # Load gmsh file (2rd) >>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>> >>>>> cs, ce = plex.getHeightStratum(0) >>>>> >>>>> cdm = plex.getCoordinateDM() >>>>> csec = dm.getCoordinateSection() >>>>> coords_gvec = dm.getCoordinates() >>>>> >>>>> for i in range(cs, ce): >>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>> 3])}') >>>>> cl = dm.getTransitiveClosure(i) >>>>> print('closure:', cl) >>>>> break >>>>> ``` >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sat Jun 18 07:31:10 2022 From: yangzongze at gmail.com (Zongze Yang) Date: Sat, 18 Jun 2022 20:31:10 +0800 Subject: [petsc-users] How to find the map between the high order coordinates of DMPlex and vertex numbering? In-Reply-To: References: <2640A1A9-101C-4DFB-BFA4-C64AF231732A@gmail.com> Message-ID: Thank you for your reply. May I ask for some references on the order of the dofs on PETSc's FE Space (especially high order elements)? Thanks, Zongze Matthew Knepley ?2022?6?18??? 20:02??? > On Sat, Jun 18, 2022 at 2:16 AM Zongze Yang wrote: > >> In order to check if I made mistakes in the python code, I try to use c >> code to show the issue on DMProjectCoordinates. 
The code and mesh file is >> attached. >> If the code is correct, there must be something wrong with >> `DMProjectCoordinates` or `DMPlexCreateGmshFromFile` for high-order mesh. >> > > Something is definitely wrong with high order, periodic simplices from > Gmsh. We had not tested that case. I am at a conference and cannot look at > it for a week. > My suspicion is that the space we make when reading in the Gmsh > coordinates does not match the values (wrong order). > > Thanks, > > Matt > > >> The command and the output are listed below: (Obviously the bounding box >> is changed.) >> ``` >> $ ./test_gmsh_load_2rd -filename cube-p2.msh -old_fe_view -new_fe_view >> Old Bounding Box: >> 0: lo = 0. hi = 1. >> 1: lo = 0. hi = 1. >> 2: lo = 0. hi = 1. >> PetscFE Object: OldCoordinatesFE 1 MPI processes >> type: basic >> Basic Finite Element in 3 dimensions with 3 components >> PetscSpace Object: P2 1 MPI processes >> type: sum >> Space in 3 variables with 3 components, size 30 >> Sum space of 3 concatenated subspaces (all identical) >> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >> type: poly >> Space in 3 variables with 1 components, size 10 >> Polynomial space of degree 2 >> PetscDualSpace Object: P2 1 MPI processes >> type: lagrange >> Dual space with 3 components, size 30 >> Discontinuous Lagrange dual space >> Quadrature of order 5 on 27 points (dim 3) >> PetscFE Object: NewCoordinatesFE 1 MPI processes >> type: basic >> Basic Finite Element in 3 dimensions with 3 components >> PetscSpace Object: P2 1 MPI processes >> type: sum >> Space in 3 variables with 3 components, size 30 >> Sum space of 3 concatenated subspaces (all identical) >> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >> type: poly >> Space in 3 variables with 1 components, size 10 >> Polynomial space of degree 2 >> PetscDualSpace Object: P2 1 MPI processes >> type: lagrange >> Dual space with 3 components, size 30 >> Continuous Lagrange dual space >> Quadrature of order 5 on 27 points (dim 3) >> New Bounding Box: >> 0: lo = 2.5624e-17 hi = 8. >> 1: lo = -9.23372e-17 hi = 7. >> 2: lo = 2.72091e-17 hi = 8.5 >> ``` >> >> Thanks, >> Zongze >> >> Zongze Yang ?2022?6?17??? 14:54??? >> >>> I tried the projection operation. However, it seems that the projection >>> gives the wrong solution. After projection, the bounding box is changed! >>> See logs below. 
>>> >>> First, I patch the petsc4py by adding `DMProjectCoordinates`: >>> ``` >>> diff --git a/src/binding/petsc4py/src/PETSc/DM.pyx >>> b/src/binding/petsc4py/src/PETSc/DM.pyx >>> index d8a58d183a..dbcdb280f1 100644 >>> --- a/src/binding/petsc4py/src/PETSc/DM.pyx >>> +++ b/src/binding/petsc4py/src/PETSc/DM.pyx >>> @@ -307,6 +307,12 @@ cdef class DM(Object): >>> PetscINCREF(c.obj) >>> return c >>> >>> + def projectCoordinates(self, FE fe=None): >>> + if fe is None: >>> + CHKERR( DMProjectCoordinates(self.dm, NULL) ) >>> + else: >>> + CHKERR( DMProjectCoordinates(self.dm, fe.fe) ) >>> + >>> def getBoundingBox(self): >>> cdef PetscInt i,dim=0 >>> CHKERR( DMGetCoordinateDim(self.dm, &dim) ) >>> diff --git a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>> b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>> index 514b6fa472..c778e39884 100644 >>> --- a/src/binding/petsc4py/src/PETSc/petscdm.pxi >>> +++ b/src/binding/petsc4py/src/PETSc/petscdm.pxi >>> @@ -90,6 +90,7 @@ cdef extern from * nogil: >>> int DMGetCoordinateDim(PetscDM,PetscInt*) >>> int DMSetCoordinateDim(PetscDM,PetscInt) >>> int DMLocalizeCoordinates(PetscDM) >>> + int DMProjectCoordinates(PetscDM, PetscFE) >>> >>> int DMCreateInterpolation(PetscDM,PetscDM,PetscMat*,PetscVec*) >>> int DMCreateInjection(PetscDM,PetscDM,PetscMat*) >>> ``` >>> >>> Then in python, I load a mesh and project the coordinates to P2: >>> ``` >>> import firedrake as fd >>> from firedrake.petsc import PETSc >>> >>> # plex = fd.mesh._from_gmsh('test-fd-load-p2.msh') >>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>> print('old bbox:', plex.getBoundingBox()) >>> >>> dim = plex.getDimension() >>> # (dim, nc, isSimplex, k, >>> qorder, comm=None) >>> fe_new = PETSc.FE().createLagrange(dim, dim, True, 2, >>> PETSc.DETERMINE) >>> plex.projectCoordinates(fe_new) >>> fe_new.view() >>> >>> print('new bbox:', plex.getBoundingBox()) >>> ``` >>> >>> The output is (The bounding box is changed!) >>> ``` >>> >>> old bbox: ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)) >>> PetscFE Object: P2 1 MPI processes >>> type: basic >>> Basic Finite Element in 3 dimensions with 3 components >>> PetscSpace Object: P2 1 MPI processes >>> type: sum >>> Space in 3 variables with 3 components, size 30 >>> Sum space of 3 concatenated subspaces (all identical) >>> PetscSpace Object: sum component (sumcomp_) 1 MPI processes >>> type: poly >>> Space in 3 variables with 1 components, size 10 >>> Polynomial space of degree 2 >>> PetscDualSpace Object: P2 1 MPI processes >>> type: lagrange >>> Dual space with 3 components, size 30 >>> Continuous Lagrange dual space >>> Quadrature of order 5 on 27 points (dim 3) >>> new bbox: ((-6.530133708576188e-17, 36.30670832662781), (-3.899962995254311e-17, 36.2406171632539), (-8.8036464152166e-17, 36.111577025012224)) >>> >>> ``` >>> >>> >>> By the way, for the original DG coordinates, where can I find the relation of the closure and the order of the dofs for the cell? >>> >>> >>> Thanks! >>> >>> >>> Zongze >>> >>> >>> >>> Matthew Knepley ?2022?6?17??? 01:11??? >>> >>>> On Thu, Jun 16, 2022 at 12:06 PM Zongze Yang >>>> wrote: >>>> >>>>> >>>>> >>>>> ? 2022?6?16??23:22?Matthew Knepley ??? >>>>> >>>>> ? >>>>> On Thu, Jun 16, 2022 at 11:11 AM Zongze Yang >>>>> wrote: >>>>> >>>>>> Hi, if I load a `gmsh` file with second-order elements, the >>>>>> coordinates will be stored in a DG-P2 space. After obtaining the >>>>>> coordinates of a cell, how can I map the coordinates to vertex and edge? >>>>>> >>>>> >>>>> By default, they are stored as P2, not DG. 
>>>>> >>>>> >>>>> I checked the coordinates vector, and found the dogs only defined on >>>>> cell other than vertex and edge, so I said they are stored as DG. >>>>> Then the function DMPlexVecGetClosure >>>>> seems return >>>>> the coordinates in lex order. >>>>> >>>>> Some code in reading gmsh file reads that >>>>> >>>>> >>>>> 1756: if (isSimplex) continuity = PETSC_FALSE >>>>> ; /* XXX >>>>> FIXME Requires DMPlexSetClosurePermutationLexicographic() */ >>>>> >>>>> >>>>> 1758: GmshCreateFE(comm, NULL, isSimplex, continuity, nodeType, >>>>> dim, coordDim, order, &fe) >>>>> >>>>> >>>>> The continuity is set to false for simplex. >>>>> >>>> >>>> Oh, yes. That needs to be fixed. For now, you can just project it to P2 >>>> if you want using >>>> >>>> https://petsc.org/main/docs/manualpages/DM/DMProjectCoordinates/ >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Zongze >>>>> >>>>> You can ask for the coordinates of a vertex or an edge directly using >>>>> >>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexPointLocalRead/ >>>>> >>>>> by giving the vertex or edge point. You can get all the coordinates on >>>>> a cell, in the closure order, using >>>>> >>>>> https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexVecGetClosure/ >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Below is some code load the gmsh file, I want to know the relation >>>>>> between `cl` and `cell_coords`. >>>>>> >>>>>> ``` >>>>>> import firedrake as fd >>>>>> import numpy as np >>>>>> >>>>>> # Load gmsh file (2rd) >>>>>> plex = fd.mesh._from_gmsh('test-fd-load-p2-rect.msh') >>>>>> >>>>>> cs, ce = plex.getHeightStratum(0) >>>>>> >>>>>> cdm = plex.getCoordinateDM() >>>>>> csec = dm.getCoordinateSection() >>>>>> coords_gvec = dm.getCoordinates() >>>>>> >>>>>> for i in range(cs, ce): >>>>>> cell_coords = cdm.getVecClosure(csec, coords_gvec, i) >>>>>> print(f'coordinates for cell {i} :\n{cell_coords.reshape([-1, >>>>>> 3])}') >>>>>> cl = dm.getTransitiveClosure(i) >>>>>> print('closure:', cl) >>>>>> break >>>>>> ``` >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dfatiac at gmail.com Sat Jun 18 09:30:03 2022 From: dfatiac at gmail.com (Mario Rossi) Date: Sat, 18 Jun 2022 16:30:03 +0200 Subject: [petsc-users] Parallel matrix-vector product with Matrix Shell In-Reply-To: <35218C6D-1BA4-4E2D-8A27-8860B6CF3E33@dsic.upv.es> References: <3AAEAEF6-82A9-478F-BAD9-DC3AE6B0025C@dsic.upv.es> <35218C6D-1BA4-4E2D-8A27-8860B6CF3E33@dsic.upv.es> Message-ID: Thanks again Jose for your prompt and very useful indication. By using that, I could understand where the REAL problem was (and obviously it was my fault). 
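For completeness, a minimal sketch of the row-block MatMult Jose described is below; the kernel my_local_row_product is hypothetical and only marks where the computation goes, it is not my actual code:

```c
#include <petscmat.h>

/* hypothetical kernel: applies global row `row` of A to the locally owned block of x */
extern PetscScalar my_local_row_product(PetscInt row, const PetscScalar *xlocal, PetscInt nxlocal);

PetscErrorCode MatMult_TM(Mat A, Vec x, Vec y)
{
  const PetscScalar *px;
  PetscScalar       *py;
  PetscInt           lo, hi, nx;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(y, &lo, &hi)); /* global rows owned by this rank */
  PetscCall(VecGetLocalSize(x, &nx));           /* x is distributed the same way as y */
  PetscCall(VecGetArrayRead(x, &px));
  PetscCall(VecGetArray(y, &py));
  for (PetscInt i = 0; i < hi - lo; i++) py[i] = my_local_row_product(lo + i, px, nx);
  PetscCall(VecRestoreArrayRead(x, &px));
  PetscCall(VecRestoreArray(y, &py));
  PetscFunctionReturn(0);
}
```

The key point is that px only holds the locally owned block of x, so any off-process entries a row needs have to be gathered explicitly, for example with the MPI_Sendrecv pattern in the nep/ex21 tutorial Jose pointed to.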
Now everything works smoothly and produces the expected result. All the best, Mario Il giorno sab 18 giu 2022 alle ore 08:27 Jose E. Roman ha scritto: > The initial vector of the Krylov method is by default a random vector, > which is different when you change the number of processes. To avoid this, > you can run with the undocumented option -bv_reproducible_random which will > generate the same random initial vector irrespective of the number of > processes. > > Alternatively, set an initial vector in your code with > EPSSetInitialSpace(), see e.g. > https://slepc.upv.es/documentation/current/src/eps/tutorials/ex5.c.html > > Jose > > > > El 18 jun 2022, a las 8:13, Mario Rossi escribi?: > > > > Dear Jose and Petsc users, I implemented the parallel matrix-vector > product and it works meaning that it produces a result but it is different > from the result produced with a single task. > > Obviously, I could be wrong in my implementation but what puzzles me is > that the input vector (x) to the product is different running with one and > two tasks and this is from the very first iteration (so it can not be due > to a previous error in the product). > > I checked that X is different with one and two tasks with the following > (naive) code > > PetscErrorCode MatMult_TM(Mat A,Vec x,Vec y) { > > void *ctx; > > PetscInt nx /* ,lo,i,j*/; > > const PetscScalar *px; > > PetscScalar *py; > > MPI_Comm comm; > > PetscFunctionBeginUser; > > PetscCall(MatShellGetContext(A,&ctx)); > > PetscCall(VecGetLocalSize(x,&nx)); > > PetscCall(PetscObjectGetComm((PetscObject)A,&comm)); > > > > // nx = *(int*)ctx; > > PetscCall(VecGetArrayRead(x,&px)); > > PetscCall(VecGetArray(y,&py)); > > > > for(int i=0;i w[%d]=%f\n",myrank,i+offset,px[i],i+offset,w[i+offset]); } > > PetscCall(MPI_Barrier(comm)); > > exit(0); > > ...... > > } > > > > Then I reordered the output obtained with one and two tasks. The first > part of the x vector is very similar (but not exactly the same) using one > and two tasks but the second part (belonging to the second task) is pretty > different > > (here "offset" is offset=(n/size)*myrank;) > > I create the matrix shell with > > > PetscCall(MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N,N,&n,&A)); > > I am sure I am doing something wrong but I don't know what I need to > look at. > > Thanks in advance! > > Mario > > > > > > Il giorno ven 17 giu 2022 alle ore 17:56 Mario Rossi > ha scritto: > > Thanks a lot, Jose! > > I looked at the eps folder (where I found the test8.c that has been my > starting point) but I did not look at the nep folder (my fault!) > > Thanks again, > > Mario > > > > Il giorno ven 17 giu 2022 alle ore 17:34 Jose E. Roman < > jroman at dsic.upv.es> ha scritto: > > You can use VecGetOwnershipRange() to determine the range of global > indices corresponding to the local portion of a vector, and VecGetArray() > to access the values. In SLEPc, you can assume that X and Y will have the > same parallel distribution. > > > > For an example of a shell matrix that implements the matrix-vector > product in parallel, have a look at this: > https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html > > It is a simple tridiagonal example, where neighborwise communication is > done with two calls to MPI_Sendrecv(). > > > > Jose > > > > > > > El 17 jun 2022, a las 17:21, Mario Rossi escribi?: > > > > > > I need to find the largest eigenvalues (say the first three) of a very > large matrix and I am using > > > a combination of PetSc and SLEPc. 
In particular, I am using a shell > matrix. I wrote a "custom" > > > matrix-vector product and everything works fine in serial (one task) > mode for a "small" case. > > > For the real case, I need multiple (at least 128) tasks for memory > reasons so I need a parallel variant of the custom matrix-vector product. I > know exactly how to write the parallel variant > > > (in plain MPI) but I am, somehow, blocked because it is not clear to > me what each task receives > > > and what is expected to provide in the parallel matrix-vector product. > > > More in detail, with a single task, the function receives the full X > vector and is expected to provide the full Y vector resulting from Y=A*X. > > > What does it happen with multiple tasks? If I understand correctly > > > in the matrix shell definition, I can choose to split the matrix into > blocks of rows so that the matrix-vector function should compute a block of > elements of the vector Y but does it receive only the corresponding subset > of the X (input vector)? (this is what I guess happens) and in output, does > > > each task return its subset of elements of Y as if it were the whole > array and then PetSc manages all the subsets? Is there anyone who has a > working example of a parallel matrix-vector product for matrix shell? > > > Thanks in advance for any help you can provide! > > > Mario > > > i > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kcpkumar33 at gmail.com Sun Jun 19 16:07:43 2022 From: kcpkumar33 at gmail.com (Pavankumar Koratikere) Date: Sun, 19 Jun 2022 17:07:43 -0400 Subject: [petsc-users] PETSc Segmentation Violation error Message-ID: Hello, I am trying to run a script that uses packages that depend on OpenMPI and PETSC (as shown below). mpirun -np 4 python test.py I am getting following error: [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 [1]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 [1]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 [1]PETSC ERROR: #1 User provided function() at unknown file:0 [1]PETSC ERROR: Checking the memory for corruption. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [2]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [2]PETSC ERROR: likely location of problem given in stack below [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [2]PETSC ERROR: INSTEAD the line number of the start of the function [2]PETSC ERROR: is given. [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Signal received [2]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [2]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 [2]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 [2]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 [2]PETSC ERROR: #1 User provided function() at unknown file:0 [2]PETSC ERROR: Checking the memory for corruption. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [3]PETSC ERROR: likely location of problem given in stack below [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [3]PETSC ERROR: INSTEAD the line number of the start of the function [3]PETSC ERROR: is given. [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Signal received [3]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 [3]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 [3]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 [3]PETSC ERROR: #1 User provided function() at unknown file:0 [3]PETSC ERROR: Checking the memory for corruption. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 [0]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 [0]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 [0]PETSC ERROR: #1 User provided function() at unknown file:0 [0]PETSC ERROR: Checking the memory for corruption. I am new to PETSc and I don't really know how to debug this. Any help will be much appreciated! Regards, Pavan. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sun Jun 19 20:59:07 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 20 Jun 2022 07:29:07 +0530 (IST) Subject: [petsc-users] PETSc Segmentation Violation error In-Reply-To: References: Message-ID: As the below message indicates - you can try running this code via valgrind or gdb to determine the location of error. https://petsc.org/release/faq/#valgrind i.e mpirun -np 4 valgrind --tool=memcheck python test.py One way to specify option -start_in_debugger is: PETSC_OPTION=-start_in_debugger mpirun -np 4 python test.py Also good to use latest petsc version - currently its 3.17 Satish On Sun, 19 Jun 2022, Pavankumar Koratikere wrote: > Hello, > > I am trying to run a script that uses packages that depend on OpenMPI and > PETSC (as shown below). > > mpirun -np 4 python test.py > > I am getting following error: > > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. 
> [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > [1]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > Jun 18 10:30:04 2022 > [1]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > --with-scalar-type=real --with-debugging=1 > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > --download-metis=yes --download-parmetis=yes > --download-superlu_dist=yes --with-shared-libraries=yes > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > [1]PETSC ERROR: #1 User provided function() at unknown file:0 > [1]PETSC ERROR: Checking the memory for corruption. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [2]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > [2]PETSC ERROR: likely location of problem given in stack below > [2]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [2]PETSC ERROR: INSTEAD the line number of the start of the function > [2]PETSC ERROR: is given. > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [2]PETSC ERROR: Signal received > [2]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [2]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > [2]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > Jun 18 10:30:04 2022 > [2]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > --with-scalar-type=real --with-debugging=1 > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > --download-metis=yes --download-parmetis=yes > --download-superlu_dist=yes --with-shared-libraries=yes > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > [2]PETSC ERROR: #1 User provided function() at unknown file:0 > [2]PETSC ERROR: Checking the memory for corruption. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [3]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > [3]PETSC ERROR: likely location of problem given in stack below > [3]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [3]PETSC ERROR: INSTEAD the line number of the start of the function > [3]PETSC ERROR: is given. > [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [3]PETSC ERROR: Signal received > [3]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [3]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > [3]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > Jun 18 10:30:04 2022 > [3]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > --with-scalar-type=real --with-debugging=1 > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > --download-metis=yes --download-parmetis=yes > --download-superlu_dist=yes --with-shared-libraries=yes > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > [3]PETSC ERROR: #1 User provided function() at unknown file:0 > [3]PETSC ERROR: Checking the memory for corruption. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > [0]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > Jun 18 10:30:04 2022 > [0]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > --with-scalar-type=real --with-debugging=1 > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > --download-metis=yes --download-parmetis=yes > --download-superlu_dist=yes --with-shared-libraries=yes > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > [0]PETSC ERROR: #1 User provided function() at unknown file:0 > [0]PETSC ERROR: Checking the memory for corruption. > > I am new to PETSc and I don't really know how to debug this. Any help > will be much appreciated! > > Regards, > > Pavan. > From jacob.fai at gmail.com Mon Jun 20 07:56:43 2022 From: jacob.fai at gmail.com (Jacob Faibussowitsch) Date: Mon, 20 Jun 2022 08:56:43 -0400 Subject: [petsc-users] PETSc Segmentation Violation error In-Reply-To: References: Message-ID: <9BECDEDA-D113-43A8-B562-68CDBB16357E@gmail.com> Glad everything worked out. (I forgot to reply-all in my initial mail, so the mailing list did not get included, adding it back in now). Best regards, Jacob Faibussowitsch (Jacob Fai - booss - oh - vitch) > On Jun 20, 2022, at 08:54, Pavankumar Koratikere wrote: > > Hello Jacob > > Thanks for your reply! As you mentioned, there was a minor discrepancy in the installation of a package which uses PETSc. I changed the version of python with which the package was configured and everything ran as expected. > > Regards, > Pavan. > > On Sun, Jun 19, 2022 at 9:09 PM Jacob Faibussowitsch wrote: > > [1]PETSC ERROR: #1 User provided function() at unknown file:0 > > The error message indicates that the segmentation violation occurs outside of PETSc. PETSc registers a SIGSEGV signal handler on startup, hence why it is the one to catch this. 
If the error was occurring somewhere within PETSc, or within a user-function called by PETSc then this stack trace would be more complete. > > Without seeing the code you are running we unfortunately cannot pinpoint the problem. > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > > > On Jun 19, 2022, at 17:07, Pavankumar Koratikere wrote: > > > > Hello, > > I am trying to run a script that uses packages that depend on OpenMPI and PETSC (as shown below). > > > > mpirun -np 4 python test.py > > I am getting following error: > > [1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [1]PETSC ERROR: or see > > https://petsc.org/release/faq/#valgrind > > > > [1]PETSC ERROR: or try > > http://valgrind.org > > on GNU/linux and Apple Mac OS X to find memory corruption errors > > [1]PETSC ERROR: likely location of problem given in stack below > > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [1]PETSC ERROR: INSTEAD the line number of the start of the function > > [1]PETSC ERROR: is given. > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Signal received > > [1]PETSC ERROR: See > > https://petsc.org/release/faq/ > > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [1]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 > > [1]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [1]PETSC ERROR: #1 User provided function() at unknown file:0 > > [1]PETSC ERROR: Checking the memory for corruption. > > [2]PETSC ERROR: ------------------------------------------------------------------------ > > [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [2]PETSC ERROR: or see > > https://petsc.org/release/faq/#valgrind > > > > [2]PETSC ERROR: or try > > http://valgrind.org > > on GNU/linux and Apple Mac OS X to find memory corruption errors > > [2]PETSC ERROR: likely location of problem given in stack below > > [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [2]PETSC ERROR: INSTEAD the line number of the start of the function > > [2]PETSC ERROR: is given. > > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [2]PETSC ERROR: Signal received > > [2]PETSC ERROR: See > > https://petsc.org/release/faq/ > > for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [2]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 > > [2]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [2]PETSC ERROR: #1 User provided function() at unknown file:0 > > [2]PETSC ERROR: Checking the memory for corruption. > > [3]PETSC ERROR: ------------------------------------------------------------------------ > > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [3]PETSC ERROR: or see > > https://petsc.org/release/faq/#valgrind > > > > [3]PETSC ERROR: or try > > http://valgrind.org > > on GNU/linux and Apple Mac OS X to find memory corruption errors > > [3]PETSC ERROR: likely location of problem given in stack below > > [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [3]PETSC ERROR: INSTEAD the line number of the start of the function > > [3]PETSC ERROR: is given. > > [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [3]PETSC ERROR: Signal received > > [3]PETSC ERROR: See > > https://petsc.org/release/faq/ > > for trouble shooting. > > [3]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [3]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 > > [3]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [3]PETSC ERROR: #1 User provided function() at unknown file:0 > > [3]PETSC ERROR: Checking the memory for corruption. > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > > https://petsc.org/release/faq/#valgrind > > > > [0]PETSC ERROR: or try > > http://valgrind.org > > on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See > > https://petsc.org/release/faq/ > > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [0]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat Jun 18 10:30:04 2022 > > [0]PETSC ERROR: Configure options --PETSC_ARCH=real-debug --with-scalar-type=real --with-debugging=1 --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran --download-metis=yes --download-parmetis=yes --download-superlu_dist=yes --with-shared-libraries=yes --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [0]PETSC ERROR: #1 User provided function() at unknown file:0 > > [0]PETSC ERROR: Checking the memory for corruption. > > > > > > I am new to PETSc and I don't really know how to debug this. Any help will be much appreciated! > > Regards, > > Pavan. > From kcpkumar33 at gmail.com Mon Jun 20 09:22:18 2022 From: kcpkumar33 at gmail.com (Pavankumar Koratikere) Date: Mon, 20 Jun 2022 10:22:18 -0400 Subject: [petsc-users] PETSc Segmentation Violation error In-Reply-To: References: Message-ID: Hello Satish Thanks for your email! There was a minor discrepancy in the installation of a package which uses PETSc. I changed the version of python with which the package was configured and everything ran as expected. The package which I am using requires me to use a specific version of PETSc, so I am using an old version. Regards, Pavan. On Sun, Jun 19, 2022 at 9:59 PM Satish Balay wrote: > As the below message indicates - you can try running this code via > valgrind or gdb to determine the location of error. > > https://petsc.org/release/faq/#valgrind > i.e > mpirun -np 4 valgrind --tool=memcheck python test.py > > One way to specify option -start_in_debugger is: > > PETSC_OPTION=-start_in_debugger mpirun -np 4 python test.py > > Also good to use latest petsc version - currently its 3.17 > > Satish > > On Sun, 19 Jun 2022, Pavankumar Koratikere wrote: > > > Hello, > > > > I am trying to run a script that uses packages that depend on OpenMPI and > > PETSC (as shown below). > > > > mpirun -np 4 python test.py > > > > I am getting following error: > > > > [1]PETSC ERROR: > > ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > probably memory access out of range > > [1]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [1]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > > OS X to find memory corruption errors > > [1]PETSC ERROR: likely location of problem given in stack below > > [1]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [1]PETSC ERROR: INSTEAD the line number of the start of the > function > > [1]PETSC ERROR: is given. > > [1]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [1]PETSC ERROR: Signal received > > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [1]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > > Jun 18 10:30:04 2022 > > [1]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > > --with-scalar-type=real --with-debugging=1 > > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > > --download-metis=yes --download-parmetis=yes > > --download-superlu_dist=yes --with-shared-libraries=yes > > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [1]PETSC ERROR: #1 User provided function() at unknown file:0 > > [1]PETSC ERROR: Checking the memory for corruption. > > [2]PETSC ERROR: > > ------------------------------------------------------------------------ > > [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > probably memory access out of range > > [2]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [2]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > > OS X to find memory corruption errors > > [2]PETSC ERROR: likely location of problem given in stack below > > [2]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [2]PETSC ERROR: INSTEAD the line number of the start of the > function > > [2]PETSC ERROR: is given. > > [2]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [2]PETSC ERROR: Signal received > > [2]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > > [2]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [2]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > > Jun 18 10:30:04 2022 > > [2]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > > --with-scalar-type=real --with-debugging=1 > > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > > --download-metis=yes --download-parmetis=yes > > --download-superlu_dist=yes --with-shared-libraries=yes > > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [2]PETSC ERROR: #1 User provided function() at unknown file:0 > > [2]PETSC ERROR: Checking the memory for corruption. > > [3]PETSC ERROR: > > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > probably memory access out of range > > [3]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [3]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > > OS X to find memory corruption errors > > [3]PETSC ERROR: likely location of problem given in stack below > > [3]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [3]PETSC ERROR: INSTEAD the line number of the start of the > function > > [3]PETSC ERROR: is given. > > [3]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [3]PETSC ERROR: Signal received > > [3]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> > [3]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [3]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > > Jun 18 10:30:04 2022 > > [3]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > > --with-scalar-type=real --with-debugging=1 > > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > > --download-metis=yes --download-parmetis=yes > > --download-superlu_dist=yes --with-shared-libraries=yes > > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [3]PETSC ERROR: #1 User provided function() at unknown file:0 > > [3]PETSC ERROR: Checking the memory for corruption. > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > > OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.15.5, Sep 29, 2021 > > [0]PETSC ERROR: Unknown Name on a real-debug named skynet by pavan Sat > > Jun 18 10:30:04 2022 > > [0]PETSC ERROR: Configure options --PETSC_ARCH=real-debug > > --with-scalar-type=real --with-debugging=1 > > --with-mpi-dir=/home/pavan/packages/openmpi-4.0.7/opt-gfortran > > --download-metis=yes --download-parmetis=yes > > --download-superlu_dist=yes --with-shared-libraries=yes > > --with-fortran-bindings=1 --with-cxx-dialect=C++11 > > [0]PETSC ERROR: #1 User provided function() at unknown file:0 > > [0]PETSC ERROR: Checking the memory for corruption. > > > > I am new to PETSc and I don't really know how to debug this. Any help > > will be much appreciated! > > > > Regards, > > > > Pavan. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.bernigaud at onera.fr Tue Jun 21 10:14:46 2022 From: pierre.bernigaud at onera.fr (Bernigaud Pierre) Date: Tue, 21 Jun 2022 17:14:46 +0200 Subject: [petsc-users] PETSc / AMRex Message-ID: Greetings, I hope you are doing great. We are currently working on parallel solver employing PETSc for the main numerical methods (GMRES, Newton-Krylov method). We would be interested in combining the PETSc solvers with the AMR framework provided by the library AMReX (https://amrex-codes.github.io/amrex/). I know that within the AMReX framework the KSP solvers provided by PETSc can be used, but what about the SNES solvers? More specifically, we are using a DMDA to manage parallel communications during the SNES calculations, and I am wondering how it would behave in a context where the data layout between processors is modified by the AMR code when refining the grid. Would you have any experience on this matter ? 
Is there any collaboration going on between PETsc and AMReX, or would you know of a code using both of them? Respectfully, Pierre Bernigaud From mfadams at lbl.gov Tue Jun 21 11:00:42 2022 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 21 Jun 2022 12:00:42 -0400 Subject: [petsc-users] PETSc / AMRex In-Reply-To: References: Message-ID: Hi Bernigaud, To be clear, you have SNES working with DMDA in AMRex, but without adapting dynamically and you want to know what to do next. Is that right? Mark On Tue, Jun 21, 2022 at 11:46 AM Bernigaud Pierre wrote: > Greetings, > > I hope you are doing great. > > We are currently working on parallel solver employing PETSc for the main > numerical methods (GMRES, Newton-Krylov method). We would be interested > in combining the PETSc solvers with the AMR framework provided by the > library AMReX (https://amrex-codes.github.io/amrex/). I know that within > the AMReX framework the KSP solvers provided by PETSc can be used, but > what about the SNES solvers? More specifically, we are using a DMDA to > manage parallel communications during the SNES calculations, and I am > wondering how it would behave in a context where the data layout between > processors is modified by the AMR code when refining the grid. > > Would you have any experience on this matter ? Is there any > collaboration going on between PETsc and AMReX, or would you know of a > code using both of them? > > Respectfully, > > Pierre Bernigaud > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 21 12:16:34 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jun 2022 11:16:34 -0600 Subject: [petsc-users] PETSc / AMRex In-Reply-To: References: Message-ID: On Tue, Jun 21, 2022 at 10:01 AM Mark Adams wrote: > Hi Bernigaud, > > To be clear, you have SNES working with DMDA in AMRex, but > without adapting dynamically and you want to know what to do next. > > Is that right? > I will let Mark answer the AMReX question since he is more knowledgeable. I just wanted to note that PETSc has good integration with the p4est ( www.p4est.org) AMR package. We can manage all parallel data and solver integration with it out of the box. Mark also has extensive experience here. Thanks, Matt > Mark > > On Tue, Jun 21, 2022 at 11:46 AM Bernigaud Pierre < > pierre.bernigaud at onera.fr> wrote: > >> Greetings, >> >> I hope you are doing great. >> >> We are currently working on parallel solver employing PETSc for the main >> numerical methods (GMRES, Newton-Krylov method). We would be interested >> in combining the PETSc solvers with the AMR framework provided by the >> library AMReX (https://amrex-codes.github.io/amrex/). I know that within >> the AMReX framework the KSP solvers provided by PETSc can be used, but >> what about the SNES solvers? More specifically, we are using a DMDA to >> manage parallel communications during the SNES calculations, and I am >> wondering how it would behave in a context where the data layout between >> processors is modified by the AMR code when refining the grid. >> >> Would you have any experience on this matter ? Is there any >> collaboration going on between PETsc and AMReX, or would you know of a >> code using both of them? >> >> Respectfully, >> >> Pierre Bernigaud >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From mfadams at lbl.gov Tue Jun 21 12:57:19 2022 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 21 Jun 2022 13:57:19 -0400 Subject: [petsc-users] PETSc / AMRex In-Reply-To: <06d9a338-f327-9724-4485-8cb8529b524c@onera.fr> References: <06d9a338-f327-9724-4485-8cb8529b524c@onera.fr> Message-ID:
(keep on the list, you will need Matt and Toby soon anyway).
So you want to add AMRex to your code.
I think the first thing that you want to do is move your DMDA code into a DMPlex code. You can create a "box" mesh and it is not hard. Others like Matt can give advice on how to get started on that translation. There is a simple step to create a DMForest (p4/8est), which Matt mentioned, from the DMPlex.
Now at this point you can run your current SNES tests and get back to where you started, but AMR is easy now. Or as easy as it gets.
As far as AMRex, well, it's not clear what AMRex does for you at this point. You don't seem to have AMRex code that you want to reuse. If there is some functionality that you need then we can talk about it, or if you have some programmatic reason to use it (e.g., they are paying you) then, again, we can talk about it.
PETSc/p4est and AMRex are similar with different strengths and designs, and you could use both, but that would complicate things.
Hope that helps, Mark
On Tue, Jun 21, 2022 at 1:18 PM Bernigaud Pierre wrote: > Hello Mark, > > We have a working solver employing SNES, to which is attached a DMDA to > handle ghost cells / data sharing between processors for flux evaluation > (using DMGlobalToLocalBegin / DMGlobalToLocalEnd). We are considering > adding an AMReX layer to the solver, but no work has been done yet, as we are > currently evaluating if it would be feasible without too much trouble. > > Our main subject of concern would be to understand how to interface > correctly PETSc (SNES+DMDA) and AMRex, as AMRex also appears to have its > own methods for parallel data management. Hence our inquiry for examples, > just to get a feel for how it would work out. > > Best, > > Pierre > On 21/06/2022 at 18:00, Mark Adams wrote: > > Hi Bernigaud, > > To be clear, you have SNES working with DMDA in AMRex, but > without adapting dynamically and you want to know what to do next. > > Is that right? > > Mark > > > > > On Tue, Jun 21, 2022 at 11:46 AM Bernigaud Pierre < > pierre.bernigaud at onera.fr> wrote: > >> Greetings, >> >> I hope you are doing great. >> >> We are currently working on parallel solver employing PETSc for the main >> numerical methods (GMRES, Newton-Krylov method). We would be interested >> in combining the PETSc solvers with the AMR framework provided by the >> library AMReX (https://amrex-codes.github.io/amrex/). I know that within >> the AMReX framework the KSP solvers provided by PETSc can be used, but >> what about the SNES solvers? More specifically, we are using a DMDA to >> manage parallel communications during the SNES calculations, and I am >> wondering how it would behave in a context where the data layout between >> processors is modified by the AMR code when refining the grid. >> >> Would you have any experience on this matter ? Is there any >> collaboration going on between PETsc and AMReX, or would you know of a >> code using both of them?
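[Editor's note: the following is a minimal sketch, not code from the thread, of the path Mark outlines above: build a DMPlex box mesh in place of the DMDA, convert it to a DMForest (p4est) so it can be adapted later, and attach it to SNES. It assumes a 2D quadrilateral box mesh, a PETSc build configured with --download-p4est, and PETSc 3.17-style PetscCall error handling; the residual/Jacobian setup is left as a placeholder.]

```
#include <petsc.h>

int main(int argc, char **argv)
{
  DM   plex, forest;
  SNES snes;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Box mesh standing in for the old DMDA; shape is controlled at run time by
     options such as -dm_plex_dim 2 -dm_plex_simplex 0 -dm_plex_box_faces 16,16 */
  PetscCall(DMCreate(PETSC_COMM_WORLD, &plex));
  PetscCall(DMSetType(plex, DMPLEX));
  PetscCall(DMSetFromOptions(plex));
  /* Convert the Plex to a p4est-backed forest (use DMP8EST in 3D) */
  PetscCall(DMConvert(plex, DMP4EST, &forest));
  PetscCall(DMSetFromOptions(forest));
  PetscCall(DMSetUp(forest));
  /* Attach the forest to SNES exactly as the DMDA was attached before */
  PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes));
  PetscCall(SNESSetDM(snes, forest));
  /* ... SNESSetFunction/SNESSetJacobian and SNESSolve as in the existing solver ... */
  PetscCall(SNESDestroy(&snes));
  PetscCall(DMDestroy(&forest));
  PetscCall(DMDestroy(&plex));
  PetscCall(PetscFinalize());
  return 0;
}
```
Once the solver runs on the forest, mesh adaptation is driven through the DMForest interface rather than by hand, so the SNES setup itself does not change.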
>> >> Respectfully, >> >> Pierre Bernigaud >> >> -- > *Pierre Bernigaud* > PhD student > Département multi-physique pour l'énergétique > Modélisation Propulsion Fusée > Tel: +33 1 80 38 62 33 > > > ONERA - The French Aerospace Lab - Centre de Palaiseau > 6, Chemin de la Vauve aux Granges - 91123 PALAISEAU > GPS coordinates: 48.715169, 2.232833 > > Follow us on: www.onera.fr | Twitter > | LinkedIn > | Facebook > | Instagram > > > > Avertissement/disclaimer https://www.onera.fr/en/emails-terms > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jlgjjjnkhffoclfc.gif Type: image/gif Size: 1041 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dldmcfkmcojhebgb.png Type: image/png Size: 16755 bytes Desc: not available URL:
From pierre.bernigaud at onera.fr Wed Jun 22 12:50:26 2022 From: pierre.bernigaud at onera.fr (Pierre Bernigaud) Date: Wed, 22 Jun 2022 19:50:26 +0200 Subject: [petsc-users] PETSc / AMRex In-Reply-To: References: <06d9a338-f327-9724-4485-8cb8529b524c@onera.fr> Message-ID:
Mark, Thank you for this roadmap. It should be doable to go from a DMDA to a DMPlex code. I wasn't aware of the existence of p4est. From what I've seen, it should fulfil our needs. I will contact you again if we encounter any trouble. Thanks again, Pierre
On 2022-06-21 19:57, Mark Adams wrote: > (keep on the list, you will need Matt and Toby soon anyway). > > So you want to add AMRex to your code. > > I think the first thing that you want to do is move your DMDA code into > a DMPlex code. You can create a "box" mesh and it is not hard. > Others like Matt can give advice on how to get started on that > translation. > There is a simple step to create a DMForest (p4/8est), which Matt > mentioned, from the DMPlex. > > Now at this point you can run your current SNES tests and get back to > where you started, but AMR is easy now. > Or as easy as it gets. > > As far as AMRex, well, it's not clear what AMRex does for you at this > point. > You don't seem to have AMRex code that you want to reuse. > If there is some functionality that you need then we can talk about it, > or if you have some programmatic reason to use it (e.g., they are paying > you) then, again, we can talk about it. > > PETSc/p4est and AMRex are similar with different strengths and designs, > and you could use both, but that would complicate things. > > Hope that helps, > Mark > > On Tue, Jun 21, 2022 at 1:18 PM Bernigaud Pierre > wrote: > > Hello Mark, > > We have a working solver employing SNES, to which is attached a DMDA to > handle ghost cells / data sharing between processors for flux > evaluation (using DMGlobalToLocalBegin / DMGlobalToLocalEnd). We are > considering adding an AMReX layer to the solver, but no work has been > done yet, as we are currently evaluating if it would be feasible > without too much trouble. > > Our main subject of concern would be to understand how to interface > correctly PETSc (SNES+DMDA) and AMRex, as AMRex also appears to have > its own methods for parallel data management. Hence our inquiry for > examples, just to get a feel for how it would work out. > > Best, > > Pierre > > On 21/06/2022 at 18:00, Mark Adams wrote: > Hi Bernigaud, > > To be clear, you have SNES working with DMDA in AMRex, but without > adapting dynamically and you want to know what to do next. > > Is that right?
> > Mark > > On Tue, Jun 21, 2022 at 11:46 AM Bernigaud Pierre > wrote: Greetings, > > I hope you are doing great. > > We are currently working on parallel solver employing PETSc for the main > numerical methods (GMRES, Newton-Krylov method). We would be interested > in combining the PETSc solvers with the AMR framework provided by the > library AMReX (https://amrex-codes.github.io/amrex/). I know that within > the AMReX framework the KSP solvers provided by PETSc can be used, but > what about the SNES solvers? More specifically, we are using a DMDA to > manage parallel communications during the SNES calculations, and I am > wondering how it would behave in a context where the data layout between > processors is modified by the AMR code when refining the grid. > > Would you have any experience on this matter ? Is there any > collaboration going on between PETsc and AMReX, or would you know of a > code using both of them? > > Respectfully, > > Pierre Bernigaud -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jlgjjjnkhffoclfc.gif Type: image/gif Size: 1041 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dldmcfkmcojhebgb.png Type: image/png Size: 16755 bytes Desc: not available URL:
From FERRANJ2 at my.erau.edu Fri Jun 24 12:52:27 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Fri, 24 Jun 2022 17:52:27 +0000 Subject: [petsc-users] [EXTERNAL] Re: DMPlex/PetscSF How to determine if local topology is other rank's ghost? In-Reply-To: References: Message-ID:
Toby and Matt: Thank you for your helpful replies. In principle, I have what I need; however, I ran into a bug with PetscSFReduce. When I run the following on the pointSF from a distributed plex (2 MPI ranks on a small mesh):
//==============================================================================================
PetscSFGetGraph(point_sf,&nroots,&nleaves,&ilocal,&iremote);
PetscCalloc2(nleaves,&leafdata,nroots,&rootdata);
/* Code that populates leafdata */
PetscSFReduceBegin(point_sf,MPIU_INT,leafdata, rootdata,MPI_SUM);
PetscSFReduceEnd(point_sf,MPIU_INT,leafdata, rootdata,MPI_SUM);
PetscSFView(point_sf,0);
PetscViewerASCIIPrintf(PETSC_VIEWER_STDOUT_WORLD,"## Reduce Leafdata\n"); //I copied this from a PetscSF example.
PetscIntView(nleaves,leafdata,PETSC_VIEWER_STDOUT_WORLD);
PetscViewerASCIIPrintf(PETSC_VIEWER_STDOUT_WORLD,"## Reduce Rootdata\n");
PetscIntView(nroots,rootdata,PETSC_VIEWER_STDOUT_WORLD);
PetscFree2(leafdata,rootdata);
//==============================================================================================
... I get the following printout:
//======================================
PetscSF Object: 2 MPI processes
type: basic
[0] Number of roots=29, leaves=5, remote ranks=1
[0] 9 <- (1,9)
[0] 11 <- (1,10)
[0] 12 <- (1,13)
[0] 20 <- (1,20)
[0] 27 <- (1,27)
[1] Number of roots=29, leaves=2, remote ranks=1
[1] 14 <- (0,13)
[1] 19 <- (0,18)
MultiSF sort=rank-order
## Reduce Leafdata
[0] 0: 2 2 2 0 0
[1] 0: 3 0
## Reduce Rootdata
[0] 0: 0 0 0 0 0 0 0 0 0 0 0 0 0 -686563120 0 0 0 0 0 0
[0] 20: 0 0 0 0 0 0 0 0 0
[1] 0: 0 0 0 0 0 0 0 0 0 0 0 0 0 128 0 0 0 0 0 0
[1] 20: -527386800 0 0 0 0 0 0 32610 0
//======================================
The good news is that the rootdata on both processors has the correct number of nonzeros after reduction.
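[Editor's aside: the allocation above sizes leafdata by nleaves, which, as Toby Isaac explains further down in the thread, is why the reduced rootdata comes out garbled. A corrected sketch of the same fragment is shown here for reference; it assumes PETSc 3.17-style PetscCall error checking (on 3.16, use ierr = ...; CHKERRQ(ierr); instead).]

```
PetscInt           nroots, nleaves, i;
const PetscInt    *ilocal;
const PetscSFNode *iremote;
PetscInt          *leafdata, *rootdata;

PetscCall(PetscSFGetGraph(point_sf, &nroots, &nleaves, &ilocal, &iremote));
/* For the point SF, leaf indices in ilocal live in the same local index space
   as the roots, so leafdata must be sized by nroots, not nleaves */
PetscCall(PetscCalloc2(nroots, &leafdata, nroots, &rootdata));
for (i = 0; i < nleaves; ++i) leafdata[ilocal ? ilocal[i] : i] = 1; /* mark each leaf point */
PetscCall(PetscSFReduceBegin(point_sf, MPIU_INT, leafdata, rootdata, MPI_SUM));
PetscCall(PetscSFReduceEnd(point_sf, MPIU_INT, leafdata, rootdata, MPI_SUM));
/* rootdata[p] now counts how many remote ranks hold point p as a ghost */
PetscCall(PetscFree2(leafdata, rootdata));
```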
The bad news is that the nonzeros are garbage (like what one gets when a variable isn't initialized). Any ideas as to what could cause this? Could something like a previous call to a PetscSF or DMPlex function do this? I am still using PETSc version 3.16, but I looked at the patch notes of 3.17 and did not see any updates on PetscSFReduce(). ________________________________ From: Matthew Knepley Sent: Wednesday, May 18, 2022 2:09 AM To: Toby Isaac Cc: Ferrand, Jesus A. ; petsc-users at mcs.anl.gov Subject: [EXTERNAL] Re: [petsc-users] DMPlex/PetscSF How to determine if local topology is other rank's ghost? CAUTION: This email originated outside of Embry-Riddle Aeronautical University. Do not click links or open attachments unless you recognize the sender and know the content is safe. On Tue, May 17, 2022 at 6:47 PM Toby Isaac > wrote: A leaf point is attached to a root point (in a star forest there are only leaves and roots), so that means that a root point would be the point that owns a degree of freedom and a leaf point would have a ghost value. For a "point SF" of a DMPlex: - Each process has a local numbering of mesh points (cells + edges + faces + vertices): they are all potential roots, so the number of these is what is returned by `nroots`. - The number of ghost mesh points is `nleaves`. - `ilocal` would be a list of the mesh points that are leaves (using the local numbering). - For each leaf in `ilocal`, `iremote` describes the root it is attached to: which process it belongs to, and its id in *that* process's local numbering. If you're trying to create dof numberings on your own, please consider PetscSectionCreateGlobalSection: . You supply the PetscSF and a PetscSection which says how many dofs there are for each point and whether any have essential boundary conditions, and it computes a global PetscSection that tells you what the global id is for each dof on this process. Toby is exactly right. Also, if you want global numbering of points you can use https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexCreatePointNumbering/ and there is a similar thing for jsut cells or vertices. Thanks, Matt On Tue, May 17, 2022 at 7:26 PM Ferrand, Jesus A. > wrote: Dear PETSc team: I am working with a non-overlapping distributed plex (i.e., when I call DMPlexDistribute(), I input overlap = 0), so only vertices and edges appear as ghosts to the local ranks. For preallocation of a parallel global stiffness matrix for FEA, I want to determine which locally owned vertices are ghosts to another rank. From reading the paper on PetscSF (https://ieeexplore.ieee.org/document/9442258) I think I can answer my question by inspecting the PetscSF returned by DMPlexDistribute() with PetscSFGetGraph(). I am just confused by the root/leaf and ilocal/iremote terminology. I read the manual page on PetscSFGetGraph() (https://petsc.org/release/docs/manualpages/PetscSF/PetscSFGetGraph.html) and that gave me the impression that I need to PetscSFBcast() the point IDs from foreign ranks to the local ones. Is this correct? [https://ieeexplore.ieee.org/assets/img/ieee_logo_smedia_200X200.png] The PetscSF Scalable Communication Layer | IEEE Journals & Magazine | IEEE Xplore PetscSF, the communication component of the Portable, Extensible Toolkit for Scientific Computation (PETSc), is designed to provide PETSc's communication infrastructure suitable for exascale computers that utilize GPUs and other accelerators. 
PetscSF provides a simple application programming interface (API) for managing common communication patterns in scientific computations by using a star ... ieeexplore.ieee.org ? Sincerely: J.A. Ferrand Embry-Riddle Aeronautical University - Daytona Beach FL M.Sc. Aerospace Engineering | May 2022 B.Sc. Aerospace Engineering B.Sc. Computational Mathematics Sigma Gamma Tau Tau Beta Pi Phone: (386)-843-1829 Email(s): ferranj2 at my.erau.edu jesus.ferrand at gmail.com -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From toby.isaac at gmail.com Fri Jun 24 13:09:40 2022 From: toby.isaac at gmail.com (Toby Isaac) Date: Fri, 24 Jun 2022 14:09:40 -0400 Subject: [petsc-users] [EXTERNAL] Re: DMPlex/PetscSF How to determine if local topology is other rank's ghost? In-Reply-To: References: Message-ID: > > //====================================== > PetscSF Object: 2 MPI processes > type: basic > [0] Number of roots=29, leaves=5, remote ranks=1 > [0] 9 <- (1,9) > [0] 11 <- (1,10) > [0] 12 <- (1,13) > [0] 20 <- (1,20) > [0] 27 <- (1,27) > [1] Number of roots=29, leaves=2, remote ranks=1 > [1] 14 <- (0,13) > [1] 19 <- (0,18) > MultiSF sort=rank-order > ## Reduce Leafdata > [0] 0: 2 2 2 0 0 > [1] 0: 3 0 > ## Reduce Rootdata > [0] 0: 0 0 0 0 0 0 0 0 0 0 0 0 0 -686563120 0 0 0 0 0 0 > [0] 20: 0 0 0 0 0 0 0 0 0 > [1] 0: 0 0 0 0 0 0 0 0 0 0 0 0 0 128 0 0 0 0 0 0 > [1] 20: -527386800 0 0 0 0 0 0 32610 0 This sf is one where the leaves are numbered as though they are sparsely drawn from a larger vector. For example, `[0] 9 <- (1,9)` means that the leaf with local index 9 on rank 0 has a root at (rank 1, index 9); the next leaf is `[0] 11 <- (1,10)`, meaning it has local index 11 and its root is at (rank 1, index 10). So `PetscSFReduceBegin()` is expecting to read the `leafdata` on rank 0 from indices 9, 11, .... But you have given it a `leafdata` array that is just a contiguous array that's the size of the number of leaves. The index spaces for leaves and roots don't have to be the same, but in the case of the point SF they always are. You should make a change like so: ``` --- PetscCalloc2(nleaves,&leafdata,nroots,&rootdata); +++ PetscCalloc2(nroots,&leafdata,nroots,&rootdata); ``` From jed at jedbrown.org Sun Jun 26 16:28:24 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 26 Jun 2022 15:28:24 -0600 Subject: [petsc-users] Load mesh as DMPlex along with Solution Fields obtained from External Codes In-Reply-To: References: Message-ID: <87mtdz8687.fsf@jedbrown.org> (Sorry you didn't get a reply earlier.) That's generally right: DM gives what is basically a distributed "function space". Actual fields in that space are Vecs. You can have one or more Vecs. The implementation of VecLoad will depend on the file format. Mike Michell writes: > Dear PETSc developer team, > > I am a user of PETSc DMPlex for a finite-volume solver. So far, I have > loaded a mesh file made by Gmsh as a DMPlex object without pre-computed > solution field. > But what if I need to load the mesh as well as solution fields that are > computed by other codes sharing the same physical domain, what is a smart > way to do that? In other words, how can I load a DM object from a mesh file > along with a defined solution field? 
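[Editor's note: a small hedged sketch of the load path discussed in this thread, and spelled out in Matthew Knepley's later reply: create the DMPlex from the Gmsh file, describe the data layout with a PetscSection, then VecLoad into a vector with that layout. The file names and the one-dof-per-cell layout are placeholders, and the file must already contain a PETSc binary Vec ordered consistently with this layout.]

```
#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM           dm;
  PetscSection s;
  Vec          u;
  PetscViewer  viewer;
  PetscInt     pStart, pEnd, cStart, cEnd, p;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(DMPlexCreateGmshFromFile(PETSC_COMM_WORLD, "mesh.msh", PETSC_TRUE, &dm)); /* placeholder file */
  /* One dof per cell, i.e. a cell-centered finite-volume field */
  PetscCall(DMPlexGetChart(dm, &pStart, &pEnd));
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd)); /* height 0 = cells */
  PetscCall(PetscSectionCreate(PETSC_COMM_WORLD, &s));
  PetscCall(PetscSectionSetChart(s, pStart, pEnd));
  for (p = cStart; p < cEnd; ++p) PetscCall(PetscSectionSetDof(s, p, 1));
  PetscCall(PetscSectionSetUp(s));
  PetscCall(DMSetLocalSection(dm, s));
  PetscCall(PetscSectionDestroy(&s));
  /* Read a previously saved PETSc binary Vec into this layout */
  PetscCall(DMCreateGlobalVector(dm, &u));
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "solution.dat", FILE_MODE_READ, &viewer)); /* placeholder file */
  PetscCall(VecLoad(u, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  /* ... use u ... */
  PetscCall(VecDestroy(&u));
  PetscCall(DMDestroy(&dm));
  PetscCall(PetscFinalize());
  return 0;
}
```
If the external code wrote its data in some other format or ordering, it must first be permuted into this layout, which is exactly the caveat raised in the replies.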
> I can think of that; load mesh to a DM object first, then declare a local > (or global) vector to read & map the external solution field onto the PETSc > data structure. But I can feel that this might not be the best way. > > Thanks, > Mike
From mersoj2 at rpi.edu Mon Jun 27 00:20:37 2022 From: mersoj2 at rpi.edu (Merson, Jacob Simon) Date: Mon, 27 Jun 2022 05:20:37 +0000 Subject: [petsc-users] Fail function evaluation with SNES Message-ID:
Hi All, I'm attempting to use the SNES solver with the finite element method. When I use the trust region or line search algorithms I'm not currently running into any problems and the solution matches a hand-coded newton solver. However, when I use other methods like quasi-newton, or newton-conjugate-gradient I end up with a guess that makes the element Jacobian negative causing issues with the residual (and Jacobian) evaluation. For this circumstance is it possible to set an error code that specifies that the function evaluation has failed and have SNES try a different step? Thanks for the help! Jacob
From bsmith at petsc.dev Mon Jun 27 05:17:00 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 27 Jun 2022 06:17:00 -0400 Subject: [petsc-users] Fail function evaluation with SNES In-Reply-To: References: Message-ID: <6FCDDA3F-60C2-4E95-84CA-B79A41F12983@petsc.dev>
You would call SNESSetFunctionDomainError() or SNESSetJacobianDomainError() from within your function or Jacobian evaluation and then return from the function. This notifies SNES that the step it attempted is not acceptable to your functions. SNES may not be able to recover from its bad step.
The simplest attempt to recover is to have SNES try a shorter step. If the bad steps come from, for example, negative pressures or other non-physical locations of the step you can try using SNESVISetVariableBounds() and friends to tell SNES what steps to avoid. > > If you have particular cases where SNES cannot recover and you can share your code we can investigate improving the handling of this feature in SNES. > > Barry > >> On Jun 27, 2022, at 1:20 AM, Merson, Jacob Simon wrote: >> >> Hi All, >> >> I?m attempting to use the SNES solver with the finite element method. When I use the trust region or line search algorithms I?m not currently running into any problems and the solution matches a hand-coded newton solver. However, when I use other methods like quasi-newton, or newton-conjugate-graduent I end up with a guess that makes the element Jacobian negative causing issues with the residual (and Jacobian) evaluation. >> >> For this circumstance is it possible to set an error code that specifies that the function evaluation has failed and have SNES try a different step? >> >> >> Thanks for the help! >> Jacob > From bsmith at petsc.dev Mon Jun 27 12:40:20 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 27 Jun 2022 13:40:20 -0400 Subject: [petsc-users] [EXTERNAL] Fail function evaluation with SNES In-Reply-To: <318716D4-261D-4A87-862B-52F50F0CB48C@rpi.edu> References: <6FCDDA3F-60C2-4E95-84CA-B79A41F12983@petsc.dev> <318716D4-261D-4A87-862B-52F50F0CB48C@rpi.edu> Message-ID: <2DC8D9AA-C378-46D2-8AE5-788FE541FCAE@petsc.dev> I forgot something, you can also use ``SNESLineSearchSetPreCheck()`` and ``SNESLineSearchSetPostCheck()`` to control properties of the steps selected by `SNES`. > On Jun 27, 2022, at 8:23 AM, Merson, Jacob Simon wrote: > > Wonderful thank you! This is exactly what I was looking for. > > ? > Jacob Merson > >> On Jun 27, 2022, at 6:17 AM, Barry Smith wrote: >> >> ? >> You would call SNESSetFunctionDomainError() or SNESSetJacobianDomainError() from within your function or Jacobian evaluation and then return from the function. This notifies SNES that the step it attempted is not acceptable to your functions. >> >> SNES may not be able to recover from its bad step. The simplest attempt to recover is to have SNES try a shorter step. If the bad steps come from, for example, negative pressures or other non-physical locations of the step you can try using SNESVISetVariableBounds() and friends to tell SNES what steps to avoid. >> >> If you have particular cases where SNES cannot recover and you can share your code we can investigate improving the handling of this feature in SNES. >> >> Barry >> >>> On Jun 27, 2022, at 1:20 AM, Merson, Jacob Simon wrote: >>> >>> Hi All, >>> >>> I?m attempting to use the SNES solver with the finite element method. When I use the trust region or line search algorithms I?m not currently running into any problems and the solution matches a hand-coded newton solver. However, when I use other methods like quasi-newton, or newton-conjugate-graduent I end up with a guess that makes the element Jacobian negative causing issues with the residual (and Jacobian) evaluation. >>> >>> For this circumstance is it possible to set an error code that specifies that the function evaluation has failed and have SNES try a different step? >>> >>> >>> Thanks for the help! 
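[Editor's note: a minimal sketch of the pattern Barry describes above: flag an inadmissible state from inside the residual callback with SNESSetFunctionDomainError() and return without filling the residual. The positivity test is only a stand-in for the user's own check (for example, a negative element Jacobian), and PETSc 3.17-style PetscCall error handling is assumed.]

```
#include <petscsnes.h>

/* Residual callback that refuses to evaluate at a non-physical state */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  const PetscScalar *x;
  PetscInt           i, n;
  PetscBool          bad = PETSC_FALSE;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(X, &n));
  PetscCall(VecGetArrayRead(X, &x));
  for (i = 0; i < n; i++) if (PetscRealPart(x[i]) <= 0.0) { bad = PETSC_TRUE; break; } /* stand-in check */
  PetscCall(VecRestoreArrayRead(X, &x));
  if (bad) {
    PetscCall(SNESSetFunctionDomainError(snes)); /* mark this step as inadmissible */
    PetscFunctionReturn(0);                      /* return without filling F */
  }
  /* ... assemble the residual into F as usual ... */
  PetscFunctionReturn(0);
}
```
The same idea applies to the Jacobian callback with SNESSetJacobianDomainError(), and bounds that should never be crossed can instead be enforced with SNESVISetVariableBounds() as mentioned above.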
>>> Jacob >> From knepley at gmail.com Tue Jun 28 06:46:56 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Jun 2022 07:46:56 -0400 Subject: [petsc-users] Load mesh as DMPlex along with Solution Fields obtained from External Codes In-Reply-To: References: Message-ID: On Wed, Jun 8, 2022 at 12:15 AM Mike Michell wrote: > Dear PETSc developer team, > > I am a user of PETSc DMPlex for a finite-volume solver. So far, I have > loaded a mesh file made by Gmsh as a DMPlex object without pre-computed > solution field. > But what if I need to load the mesh as well as solution fields that are > computed by other codes sharing the same physical domain, what is a smart > way to do that? In other words, how can I load a DM object from a mesh file > along with a defined solution field? > I can think of that; load mesh to a DM object first, then declare a local > (or global) vector to read & map the external solution field onto the PETSc > data structure. But I can feel that this might not be the best way. > Here was my idea for this. PetscSection is an abstraction for laying out data over a DMPlex. In parallel, each local Section lays out local data, and a PetscSF points "ghost" mesh points at the owner. From this we can make a _global_ Section automatically that lays out globally consistent data. Thus, in order to match an external layout, you need to: 1) Match the mesh topology with DMPlex 2) Match the mesh parallel layout with a PetscSF 3) Match the local data layout with a PetscSection (might require specifying a permutation of the mesh points to the section) Then you should be able to load your data with VecLoad(). Let me know if this is unclear or does not work for you. Thanks, Matt > Thanks, > Mike > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 28 12:17:33 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Jun 2022 13:17:33 -0400 Subject: [petsc-users] List of points with dof>0 in a PetscSection In-Reply-To: References: Message-ID: On Fri, Jun 10, 2022 at 4:06 PM Blaise Bourdin wrote: > Hi, > > Given a PetscSection, is there an easy way to get a list of point at which > the number of dof is >0? > For instance, when projecting over a FE space, I?d rather do a loop over > such points than do a loop over all points in a DM, get the number of dof, > and test if it is >0. > We do not have an index like this. There is always a tradeoff between direct indexing (as we have now in Section) and indirect indexing (as you would have if you compressed the indices). For internal uses, the search would never pay off I think. It would not be hard to make this up front if you think it would be beneficial. I have never seen a case where the loop takes much more time than the compressed version. Thanks, Matt > Regards, > Blaise > -- > Professor, Department of Mathematics & Statistics > Hamilton Hall room 409A, McMaster University > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada > https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From balay at mcs.anl.gov Wed Jun 29 02:46:23 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 29 Jun 2022 13:16:23 +0530 (IST) Subject: [petsc-users] petsc-3.17.3 now available Message-ID:
Dear PETSc users, The patch release petsc-3.17.3 is now available for download. http://www.mcs.anl.gov/petsc/download/index.html Satish
From Jiannan_Tu at uml.edu Wed Jun 29 13:23:15 2022 From: Jiannan_Tu at uml.edu (Tu, Jiannan) Date: Wed, 29 Jun 2022 18:23:15 +0000 Subject: [petsc-users] KPS and linear complex equation system Message-ID:
I have a quick question. PETSc can be configured with complex numbers. Can KSP then be used to solve linear equations with complex numbers, that is, where both the matrix elements and the solutions are complex, directly, without separating real and imaginary parts? Thank you very much. Best Jiannan Tu -------------- next part -------------- An HTML attachment was scrubbed... URL:
From knepley at gmail.com Wed Jun 29 13:58:27 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Jun 2022 14:58:27 -0400 Subject: [petsc-users] KPS and linear complex equation system In-Reply-To: References: Message-ID:
On Wed, Jun 29, 2022 at 2:24 PM Tu, Jiannan wrote: > I have a quick question. PETSc can be configured with complex numbers. Can > KSP then be used to solve linear equations with complex numbers, that is, where both > the matrix elements and the solutions are complex, directly, without separating > real and imaginary parts? > Yes. Not all solvers make sense in this mode, but some do. Thanks, Matt > Thank you very much. > > Best > Jiannan Tu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From Jiannan_Tu at uml.edu Wed Jun 29 22:04:48 2022 From: Jiannan_Tu at uml.edu (Tu, Jiannan) Date: Thu, 30 Jun 2022 03:04:48 +0000 Subject: [petsc-users] KPS and linear complex equation system In-Reply-To: References: Message-ID:
Matt, thank you very much for the reply. Jiannan
From: Matthew Knepley Sent: Wednesday, June 29, 2022 2:58 PM To: Tu, Jiannan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] KPS and linear complex equation system
On Wed, Jun 29, 2022 at 2:24 PM Tu, Jiannan > wrote: I have a quick question. PETSc can be configured with complex numbers. Can KSP then be used to solve linear equations with complex numbers, that is, where both the matrix elements and the solutions are complex, directly, without separating real and imaginary parts? Yes. Not all solvers make sense in this mode, but some do. Thanks, Matt Thank you very much. Best Jiannan Tu -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
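[Editor's note: a tiny sketch to make the complex-scalar usage above concrete. It assumes a PETSc build configured with --with-scalar-type=complex and PETSc 3.17 or later for PetscCall; the 1x1 system is purely illustrative, and a real application assembles its full complex matrix and right-hand side the same way.]

```
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         b, x;
  KSP         ksp;
  PetscMPIInt rank;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  /* 1x1 complex "system": (2+3i) x = 1-1i, to show that entries are complex PetscScalars */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 1, 1, 1, NULL, 0, NULL, &A));
  if (rank == 0) PetscCall(MatSetValue(A, 0, 0, 2.0 + 3.0 * PETSC_i, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0 - 1.0 * PETSC_i)); /* complex right-hand side */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* e.g. -ksp_type gmres -pc_type jacobi -ksp_monitor */
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(VecView(x, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(PetscFinalize());
  return 0;
}
```
Running with -ksp_view confirms that the solver operates directly on complex scalars, with no splitting into real and imaginary parts.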