[petsc-users] [External] Re: MatVec on GPUs

Matthew Knepley knepley at gmail.com
Tue Oct 19 20:34:28 CDT 2021


On Tue, Oct 19, 2021 at 9:18 PM Swarnava Ghosh <swarnava89 at gmail.com> wrote:

> Thank you Junchao! Is it possible to determine how much time is being
> spent on data transfer from the CPU mem to the GPU mem from the log?
>

It looks like the transfer time is captured by these events (the Time column is in seconds; the CpuToGpu/GpuToCpu Size columns are megabytes copied):

VecCUDACopyTo        891 1.1 1.5322e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    842 6.23e+01    0 0.00e+00  0

VecCUDACopyFrom      891 1.1 1.5837e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00  842 6.23e+01  0

MatCUSPARSCopyTo     891 1.1 1.5229e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    842 1.93e+03    0 0.00e+00  0

  Thanks,

     Matt


>
> ************************************************************************************************************************
>
> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
>
>
> ************************************************************************************************************************
>
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
>
> /ccsopen/home/swarnava/MiniApp_xl_cu/bin/sq on a  named h49n15 with 4 processors, by swarnava Tue Oct 19 21:10:56 2021
>
> Using Petsc Release Version 3.15.0, Mar 30, 2021
>
>
>                          Max       Max/Min     Avg       Total
>
> Time (sec):           1.172e+02     1.000   1.172e+02
>
> Objects:              1.160e+02     1.000   1.160e+02
>
> Flop:                 5.832e+10     1.125   5.508e+10  2.203e+11
>
> Flop/sec:             4.974e+08     1.125   4.698e+08  1.879e+09
>
> MPI Messages:         0.000e+00     0.000   0.000e+00  0.000e+00
>
> MPI Message Lengths:  0.000e+00     0.000   0.000e+00  0.000e+00
>
> MPI Reductions:       1.320e+02     1.000
>
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>
>                             e.g., VecAXPY() for real vectors of length N
> --> 2N flop
>
>                             and VecAXPY() for complex vectors of length N
> --> 8N flop
>
>
> Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
>
>                         Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
>
>  0:      Main Stage: 1.1725e+02 100.0%  2.2033e+11 100.0%  0.000e+00 0.0%  0.000e+00        0.0%  1.140e+02  86.4%
>
>
>
> ------------------------------------------------------------------------------------------------------------------------
>
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
>
> Phase summary info:
>
>    Count: number of times phase was executed
>
>    Time and Flop: Max - maximum over all processors
>
>                   Ratio - ratio of maximum to minimum over all processors
>
>    Mess: number of messages sent
>
>    AvgLen: average message length (bytes)
>
>    Reduct: number of global reductions
>
>    Global: entire computation
>
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>
>       %T - percent time in this phase         %F - percent flop in this
> phase
>
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>
>       %R - percent reductions in this phase
>
>    Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time
> over all processors)
>
>    GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU
> time over all processors)
>
>    CpuToGpu Count: total number of CPU to GPU copies per processor
>
>    CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per
> processor)
>
>    GpuToCpu Count: total number of GPU to CPU copies per processor
>
>    GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per
> processor)
>
>    GPU %F: percent flops on GPU in this event
>
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Event                Count      Time (sec)     Flop         --- Global ---  --- Stage ----  Total   GPU    - CpuToGpu -   - GpuToCpu - GPU
>
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   AvgLen Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s Mflop/s Count   Size   Count   Size  %F
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> --- Event Stage 0: Main Stage
>
>
> BuildTwoSided          2 1.0 6.2501e-03145.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   0  0  0  0  2     0       0      0 0.00e+00    0 0.00e+00  0
>
> BuildTwoSidedF         2 1.0 6.2628e-03123.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   0  0  0  0  2     0       0      0 0.00e+00    0 0.00e+00  0
>
> VecDot             89991 1.1 3.4663e+00 1.2 1.67e+09 1.1 0.0e+00 0.0e+00 0.0e+00  3  3  0  0  0   3  3  0  0  0  1816    1841      0 0.00e+00 84992 6.80e-01 100
>
> VecNorm            89991 1.1 5.5282e+00 1.2 1.67e+09 1.1 0.0e+00 0.0e+00 0.0e+00  4  3  0  0  0   4  3  0  0  0  1139    1148      0 0.00e+00 84992 6.80e-01 100
>
> VecScale           89991 1.1 1.3902e+00 1.2 8.33e+08 1.1 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0  2265    2343   84992 6.80e-01    0 0.00e+00 100
>
> VecCopy           178201 1.1 2.9825e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
>
> VecSet              3589 1.1 1.0195e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
>
> VecAXPY           179091 1.1 2.7456e+00 1.2 3.32e+09 1.1 0.0e+00 0.0e+00 0.0e+00  2  6  0  0  0   2  6  0  0  0  4564    4739   169142 1.35e+00    0 0.00e+00 100
>
> VecCUDACopyTo        891 1.1 1.5322e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    842 6.23e+01    0 0.00e+00  0
>
> VecCUDACopyFrom      891 1.1 1.5837e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00  842 6.23e+01  0
>
> DMCreateMat            5 1.0 7.3491e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00  1  0  0  0  5   1  0  0  0  6     0       0      0 0.00e+00    0 0.00e+00  0
>
> SFSetGraph             5 1.0 3.5016e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
>
> MatMult            89991 1.1 2.0423e+00 1.2 5.08e+10 1.1 0.0e+00 0.0e+00 0.0e+00  2 87  0  0  0   2 87  0  0  0 94039   105680   1683 2.00e+03    0 0.00e+00 100
>
> MatCopy              891 1.1 1.3600e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
>
> MatConvert             2 1.0 1.0489e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 0.00e+00  0
>
> MatScale               2 1.0 2.7950e-04 1.3 3.18e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4530       0      0 0.00e+00    0 0.00e+00  0
>
> MatAssemblyBegin       7 1.0 6.3768e-0368.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   0  0  0  0  2     0       0      0 0.00e+00    0 0.00e+00  0
>
> MatAssemblyEnd         7 1.0 7.9870e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  3   0  0  0  0  4     0       0      0 0.00e+00    0 0.00e+00  0
>
> MatCUSPARSCopyTo     891 1.1 1.5229e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    842 1.93e+03    0 0.00e+00  0
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
>
> Object Type          Creations   Destructions     Memory  Descendants' Mem.
>
> Reports information only for process 0.
>
>
> --- Event Stage 0: Main Stage
>
>
>               Vector    69             11        19112     0.
>
>     Distributed Mesh     3              0            0     0.
>
>            Index Set    12             10       187512     0.
>
>    IS L to G Mapping     3              0            0     0.
>
>    Star Forest Graph    11              0            0     0.
>
>      Discrete System     3              0            0     0.
>
>            Weak Form     3              0            0     0.
>
>    Application Order     1              0            0     0.
>
>               Matrix     8              0            0     0.
>
>        Krylov Solver     1              0            0     0.
>
>       Preconditioner     1              0            0     0.
>
>               Viewer     1              0            0     0.
>
>
> ========================================================================================================================
>
> Average time to get PetscTime(): 4.32e-08
>
> Average time for MPI_Barrier(): 9.94e-07
>
> Average time for zero size MPI_Send(): 4.20135e-05
>
>
> Sincerely,
>
> SG
>
> On Tue, Oct 19, 2021 at 12:28 AM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>>
>>
>>
>> On Mon, Oct 18, 2021 at 10:56 PM Swarnava Ghosh <swarnava89 at gmail.com>
>> wrote:
>>
>>> I am trying to port parts of the following function to GPUs.
>>> Essentially, the lines of code between the two "TODO..." comments should
>>> be executed on the device. Here is the function:
>>>
>>> PetscScalar CalculateSpectralNodesAndWeights(LSDFT_OBJ *pLsdft, int p,
>>> int LIp)
>>> {
>>>
>>>   PetscInt N_qp;
>>>   N_qp = pLsdft->N_qp;
>>>
>>>   int k;
>>>   PetscScalar *a, *b;
>>>   k=0;
>>>
>>>   PetscMalloc(sizeof(PetscScalar)*(N_qp+1), &a);
>>>   PetscMalloc(sizeof(PetscScalar)*(N_qp+1), &b);
>>>
>>>   /*
>>>    * TODO: COPY a, b, pLsdft->Vk, pLsdft->Vkm1, pLsdft->Vkp1,
>>> pLsdft->LapPlusVeffOprloc, k,p,N_qp from HOST to DEVICE
>>>    * DO THE FOLLOWING OPERATIONS ON DEVICE
>>>    */
>>>
>>>   //zero out vectors
>>>   VecZeroEntries(pLsdft->Vk);
>>>   VecZeroEntries(pLsdft->Vkm1);
>>>   VecZeroEntries(pLsdft->Vkp1);
>>>
>>>   VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES);
>>>   MatMult(pLsdft->LapPlusVeffOprloc,pLsdft->Vkm1,pLsdft->Vk);
>>>   VecDot(pLsdft->Vkm1, pLsdft->Vk, &a[0]);
>>>   VecAXPY(pLsdft->Vk, -a[0], pLsdft->Vkm1);
>>>   VecNorm(pLsdft->Vk, NORM_2, &b[0]);
>>>   VecScale(pLsdft->Vk, 1.0 / b[0]);
>>>
>>>   for (k = 0; k < N_qp; k++) {
>>>     MatMult(pLsdft->LapPlusVeffOprloc,pLsdft->Vk,pLsdft->Vkp1);
>>>     VecDot(pLsdft->Vk, pLsdft->Vkp1, &a[k + 1]);
>>>     VecAXPY(pLsdft->Vkp1, -a[k + 1], pLsdft->Vk);
>>>     VecAXPY(pLsdft->Vkp1, -b[k], pLsdft->Vkm1);
>>>     VecCopy(pLsdft->Vk, pLsdft->Vkm1);
>>>     VecNorm(pLsdft->Vkp1, NORM_2, &b[k + 1]);
>>>     VecCopy(pLsdft->Vkp1, pLsdft->Vk);
>>>     VecScale(pLsdft->Vk, 1.0 / b[k + 1]);
>>>   }
>>>
>>>   /*
>>>    * TODO: Copy back a, b, pLsdft->Vk, pLsdft->Vkm1, pLsdft->Vkp1,
>>> pLsdft->LapPlusVeffOprloc, k,p,N_qp from DEVICE to HOST
>>>    */
>>>
>>>   /*
>>>    * Some operation with a, and b on HOST
>>>    *
>>>    */
>>>   TridiagEigenVecSolve_NodesAndWeights(pLsdft, a, b, N_qp, LIp);  //
>>> operation on the host
>>>
>>>   // free a,b
>>>   PetscFree(a);
>>>   PetscFree(b);
>>>
>>>   return 0;
>>> }
>>>
>>> If I just use the command line options to set vectors Vk,Vkp1 and Vkm1
>>> as cuda vectors and the matrix  LapPlusVeffOprloc as aijcusparse, will the
>>> lines of code between the two "TODO" comments be entirely executed on the
>>> device?
>>>
>> Yes, except for VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES), which is
>> done on the CPU by pulling the vector data down from the GPU and setting the
>> value there. Subsequent vector operations will push the updated data back to
>> the GPU.
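>>
>> As a minimal sketch of what that means for your setup (assuming the vectors
>> were created with -vec_type cuda and the matrix with -mat_type aijcusparse;
>> the object names are the ones from your code):
>>
>>   VecZeroEntries(pLsdft->Vk);                                    /* runs on the GPU */
>>   VecZeroEntries(pLsdft->Vkm1);                                  /* runs on the GPU */
>>   VecZeroEntries(pLsdft->Vkp1);                                  /* runs on the GPU */
>>   VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES);              /* pulls Vkm1 to the CPU and sets the entry there */
>>   MatMult(pLsdft->LapPlusVeffOprloc, pLsdft->Vkm1, pLsdft->Vk);  /* pushes Vkm1 back to the GPU and multiplies there */
>>   VecDot(pLsdft->Vkm1, pLsdft->Vk, &a[0]);                       /* computed on the GPU, scalar returned to the host */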
>>
>>
>>>
>>> Sincerely,
>>> Swarnava
>>>
>>>
>>> On Mon, Oct 18, 2021 at 10:13 PM Swarnava Ghosh <swarnava89 at gmail.com>
>>> wrote:
>>>
>>>> Thanks for the clarification, Junchao.
>>>>
>>>> Sincerely,
>>>> Swarnava
>>>>
>>>> On Mon, Oct 18, 2021 at 10:08 PM Junchao Zhang <junchao.zhang at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 18, 2021 at 8:47 PM Swarnava Ghosh <swarnava89 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Junchao,
>>>>>>
>>>>>> If I want to pass command line options as  -mymat_mat_type
>>>>>> aijcusparse, should it be MatSetOptionsPrefix(A,"mymat"); or
>>>>>> MatSetOptionsPrefix(A,"mymat_"); ? Could you please clarify?
>>>>>>
>>>>>  my fault, it should be MatSetOptionsPrefix(A,"mymat_"), as seen in
>>>>> mat/tests/ex62.c
>>>>>  Thanks
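>>>>>
>>>>>  For example, a minimal sketch of wiring that up (A and n are placeholder
>>>>>  names here; the PETSc calls themselves are standard):
>>>>>
>>>>>    Mat A;
>>>>>    MatCreate(PETSC_COMM_WORLD, &A);
>>>>>    MatSetOptionsPrefix(A, "mymat_");  /* options for this matrix start with -mymat_ */
>>>>>    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
>>>>>    MatSetFromOptions(A);              /* picks up -mymat_mat_type aijcusparse */
>>>>>    /* ... preallocate and fill A as usual ... */
>>>>>
>>>>>  and then run with -mymat_mat_type aijcusparse so that only this matrix is
>>>>>  switched to a CUDA type.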
>>>>>
>>>>>
>>>>>>
>>>>>> Sincerely,
>>>>>> Swarnava
>>>>>>
>>>>>> On Mon, Oct 18, 2021 at 9:23 PM Junchao Zhang <
>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>
>>>>>>> MatSetOptionsPrefix(A,"mymat")
>>>>>>> VecSetOptionsPrefix(v,"myvec")
>>>>>>>
>>>>>>> --Junchao Zhang
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 18, 2021 at 8:04 PM Chang Liu <cliu at pppl.gov> wrote:
>>>>>>>
>>>>>>>> Hi Junchao,
>>>>>>>>
>>>>>>>> Thank you for your answer. I tried MatConvert and it works. I didn't
>>>>>>>> get it to work before because I had forgotten to convert a vector from
>>>>>>>> mpi to mpicuda.
>>>>>>>>
>>>>>>>> For vectors, there is no VecConvert to use, so I have to do
>>>>>>>> VecDuplicate, VecSetType, and VecCopy. Is there an easier option?
>>>>>>>>
>>>>>>>  As Matt suggested, you could single out the matrix and vector with an
>>>>>>> options prefix and set their types on the command line:
>>>>>>>
>>>>>>> MatSetOptionsPrefix(A,"mymat");
>>>>>>> VecSetOptionsPrefix(v,"myvec");
>>>>>>>
>>>>>>> Then, -mymat_mat_type aijcusparse -myvec_vec_type cuda
>>>>>>>
>>>>>>> A simpler approach is to have the vector type set automatically by
>>>>>>> MatCreateVecs(A,&v,NULL).
>>>>>>>
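>>>>>>> As a sketch of the two options (A, v, w, and w_gpu are placeholder names):
>>>>>>>
>>>>>>>   Vec v;
>>>>>>>   MatCreateVecs(A, &v, NULL);   /* v gets a type compatible with A, e.g. CUDA for an aijcusparse matrix */
>>>>>>>
>>>>>>> or, converting an existing host vector by hand, roughly what you describe:
>>>>>>>
>>>>>>>   Vec w_gpu;
>>>>>>>   VecDuplicate(w, &w_gpu);      /* same layout as the host vector w */
>>>>>>>   VecSetType(w_gpu, VECCUDA);   /* switch the duplicate to a CUDA vector */
>>>>>>>   VecCopy(w, w_gpu);            /* copy the values over */
>>>>>>>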
>>>>>>>
>>>>>>>> Chang
>>>>>>>>
>>>>>>>> On 10/18/21 5:23 PM, Junchao Zhang wrote:
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Mon, Oct 18, 2021 at 3:42 PM Chang Liu via petsc-users
>>>>>>>> > <petsc-users at mcs.anl.gov <mailto:petsc-users at mcs.anl.gov>> wrote:
>>>>>>>> >
>>>>>>>> >     Hi Matt,
>>>>>>>> >
>>>>>>>> >     I have a related question. In my code I have many matrices, and I
>>>>>>>> >     only want one of them living on the GPU; the others should stay in
>>>>>>>> >     CPU memory.
>>>>>>>> >
>>>>>>>> >     I wonder if there is an easier way to copy an mpiaij matrix to
>>>>>>>> >     mpiaijcusparse (in other words, to copy the data to the GPU). I can
>>>>>>>> >     think of creating a new mpiaijcusparse matrix and copying the data
>>>>>>>> >     line by line, but I wonder if there is a better option.
>>>>>>>> >
>>>>>>>> >     I have tried MatCopy and MatConvert but neither work.
>>>>>>>> >
>>>>>>>> > Did you use MatConvert(mat,matype,MAT_INPLACE_MATRIX,&mat)?
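>>>>>>>> >
>>>>>>>> > For example, a sketch of converting an existing MPIAIJ matrix in place
>>>>>>>> > (B, x, y are placeholder names):
>>>>>>>> >
>>>>>>>> >   MatConvert(B, MATMPIAIJCUSPARSE, MAT_INPLACE_MATRIX, &B);
>>>>>>>> >
>>>>>>>> > After that, MatMult(B, x, y) runs on the GPU, provided x and y are also
>>>>>>>> > CUDA vectors.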
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >     Chang
>>>>>>>> >
>>>>>>>> >     On 10/17/21 7:50 PM, Matthew Knepley wrote:
>>>>>>>> >      > On Sun, Oct 17, 2021 at 7:12 PM Swarnava Ghosh
>>>>>>>> >     <swarnava89 at gmail.com <mailto:swarnava89 at gmail.com>
>>>>>>>> >      > <mailto:swarnava89 at gmail.com <mailto:swarnava89 at gmail.com>>>
>>>>>>>> wrote:
>>>>>>>> >      >
>>>>>>>> >      >     Do I need to convert the MATSEQBAIJ to a cuda matrix in
>>>>>>>> >      >     code?
>>>>>>>> >      >
>>>>>>>> >      >
>>>>>>>> >      > You would need a call to MatSetFromOptions() to take that
>>>>>>>> type
>>>>>>>> >     from the
>>>>>>>> >      > command line, and not have
>>>>>>>> >      > the type hard-coded in your application. It is generally a
>>>>>>>> bad
>>>>>>>> >     idea to
>>>>>>>> >      > hard code the implementation type.
>>>>>>>> >      >
>>>>>>>> >      >     If I do it from the command line, are the other MatVec
>>>>>>>> >      >     calls then also ported onto CUDA? I have many MatVec calls
>>>>>>>> >      >     in my code, but I specifically want to port just one call.
>>>>>>>> >      >
>>>>>>>> >      >
>>>>>>>> >      > You can give that one matrix an options prefix to isolate
>>>>>>>> it.
>>>>>>>> >      >
>>>>>>>> >      >    Thanks,
>>>>>>>> >      >
>>>>>>>> >      >       Matt
>>>>>>>> >      >
>>>>>>>> >      >     Sincerely,
>>>>>>>> >      >     Swarnava
>>>>>>>> >      >
>>>>>>>> >      >     On Sun, Oct 17, 2021 at 7:07 PM Junchao Zhang
>>>>>>>> >      >     <junchao.zhang at gmail.com <mailto:
>>>>>>>> junchao.zhang at gmail.com>
>>>>>>>> >     <mailto:junchao.zhang at gmail.com <mailto:
>>>>>>>> junchao.zhang at gmail.com>>>
>>>>>>>> >     wrote:
>>>>>>>> >      >
>>>>>>>> >      >         You can do that with command line options -mat_type
>>>>>>>> >     aijcusparse
>>>>>>>> >      >         -vec_type cuda
>>>>>>>> >      >
>>>>>>>> >      >         On Sun, Oct 17, 2021, 5:32 PM Swarnava Ghosh
>>>>>>>> >      >         <swarnava89 at gmail.com <mailto:swarnava89 at gmail.com
>>>>>>>> >
>>>>>>>> >     <mailto:swarnava89 at gmail.com <mailto:swarnava89 at gmail.com>>>
>>>>>>>> wrote:
>>>>>>>> >      >
>>>>>>>> >      >             Dear Petsc team,
>>>>>>>> >      >
>>>>>>>> >      >             I had a query regarding using CUDA to
>>>>>>>> accelerate a matrix
>>>>>>>> >      >             vector product.
>>>>>>>> >      >             I have a sequential sparse matrix
>>>>>>>> (MATSEQBAIJ type).
>>>>>>>> >     I want
>>>>>>>> >      >             to port a MatVec call onto GPUs. Is there any
>>>>>>>> >     code/example I
>>>>>>>> >      >             can look at?
>>>>>>>> >      >
>>>>>>>> >      >             Sincerely,
>>>>>>>> >      >             SG
>>>>>>>> >      >
>>>>>>>> >      >
>>>>>>>> >      >
>>>>>>>> >      > --
>>>>>>>> >      > What most experimenters take for granted before they begin
>>>>>>>> their
>>>>>>>> >      > experiments is infinitely more interesting than any
>>>>>>>> results to which
>>>>>>>> >      > their experiments lead.
>>>>>>>> >      > -- Norbert Wiener
>>>>>>>> >      >
>>>>>>>> >      > https://www.cse.buffalo.edu/~knepley/
>>>>>>>> >     <https://www.cse.buffalo.edu/~knepley/>
>>>>>>>> >     <http://www.cse.buffalo.edu/~knepley/
>>>>>>>> >     <http://www.cse.buffalo.edu/~knepley/>>
>>>>>>>> >
>>>>>>>> >     --
>>>>>>>> >     Chang Liu
>>>>>>>> >     Staff Research Physicist
>>>>>>>> >     +1 609 243 3438
>>>>>>>> >     cliu at pppl.gov <mailto:cliu at pppl.gov>
>>>>>>>> >     Princeton Plasma Physics Laboratory
>>>>>>>> >     100 Stellarator Rd, Princeton NJ 08540, USA
>>>>>>>> >
>>>>>>>>
>>>>>>>> --
>>>>>>>> Chang Liu
>>>>>>>> Staff Research Physicist
>>>>>>>> +1 609 243 3438
>>>>>>>> cliu at pppl.gov
>>>>>>>> Princeton Plasma Physics Laboratory
>>>>>>>> 100 Stellarator Rd, Princeton NJ 08540, USA
>>>>>>>>
>>>>>>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>