[petsc-users] [External] Re: MatVec on GPUs
Swarnava Ghosh
swarnava89 at gmail.com
Tue Oct 19 21:01:14 CDT 2021
Thanks, Matt!
Sincerely,
SG
On Tue, Oct 19, 2021 at 9:34 PM Matthew Knepley <knepley at gmail.com> wrote:
> On Tue, Oct 19, 2021 at 9:18 PM Swarnava Ghosh <swarnava89 at gmail.com>
> wrote:
>
>> Thank you Junchao! Is it possible to determine how much time is being
>> spent on data transfer from the CPU mem to the GPU mem from the log?
>>
>
> It looks like
>
> VecCUDACopyTo 891 1.1 1.5322e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 842 6.23e+01 0 0.00e+00 0
>
> VecCUDACopyFrom 891 1.1 1.5837e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 842 6.23e+01 0
>
> MatCUSPARSCopyTo 891 1.1 1.5229e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 842 1.93e+03 0 0.00e+00 0
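>
> Roughly, summing the max times of those three events gives 1.5e-02 + 1.6e-02 +
> 1.5e-01, i.e. about 0.18 s in explicit host/device copies. The Count/Size
> columns say each process made 842 vector copies totaling about 62 MB in each
> direction, and 842 matrix copies totaling about 1.9 GB to the GPU (sizes are
> reported in MB).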
>
> Thanks,
>
> Matt
>
>
>>
>> ************************************************************************************************************************
>>
>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
>>
>>
>> ************************************************************************************************************************
>>
>>
>> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>>
>>
>> /ccsopen/home/swarnava/MiniApp_xl_cu/bin/sq on a named h49n15 with 4 processors, by swarnava Tue Oct 19 21:10:56 2021
>>
>> Using Petsc Release Version 3.15.0, Mar 30, 2021
>>
>>
>> Max Max/Min Avg Total
>>
>> Time (sec): 1.172e+02 1.000 1.172e+02
>>
>> Objects: 1.160e+02 1.000 1.160e+02
>>
>> Flop: 5.832e+10 1.125 5.508e+10 2.203e+11
>>
>> Flop/sec: 4.974e+08 1.125 4.698e+08 1.879e+09
>>
>> MPI Messages: 0.000e+00 0.000 0.000e+00 0.000e+00
>>
>> MPI Message Lengths: 0.000e+00 0.000 0.000e+00 0.000e+00
>>
>> MPI Reductions: 1.320e+02 1.000
>>
>>
>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>>
>> e.g., VecAXPY() for real vectors of length N --> 2N flop
>>
>> and VecAXPY() for complex vectors of length N --> 8N flop
>>
>>
>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions --
>>
>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total
>>
>> 0: Main Stage: 1.1725e+02 100.0% 2.2033e+11 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 1.140e+02 86.4%
>>
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> See the 'Profiling' chapter of the users' manual for details on interpreting output.
>>
>> Phase summary info:
>>
>> Count: number of times phase was executed
>>
>> Time and Flop: Max - maximum over all processors
>>
>> Ratio - ratio of maximum to minimum over all processors
>>
>> Mess: number of messages sent
>>
>> AvgLen: average message length (bytes)
>>
>> Reduct: number of global reductions
>>
>> Global: entire computation
>>
>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>>
>> %T - percent time in this phase %F - percent flop in this phase
>>
>> %M - percent messages in this phase %L - percent message lengths in this phase
>>
>> %R - percent reductions in this phase
>>
>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
>>
>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)
>>
>> CpuToGpu Count: total number of CPU to GPU copies per processor
>>
>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)
>>
>> GpuToCpu Count: total number of GPU to CPU copies per processor
>>
>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)
>>
>> GPU %F: percent flops on GPU in this event
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU
>>
>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F
>>
>>
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>> --- Event Stage 0: Main Stage
>>
>>
>> BuildTwoSided 2 1.0 6.2501e-03145.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> BuildTwoSidedF 2 1.0 6.2628e-03123.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> VecDot 89991 1.1 3.4663e+00 1.2 1.67e+09 1.1 0.0e+00 0.0e+00 0.0e+00 3 3 0 0 0 3 3 0 0 0 1816 1841 0 0.00e+00 84992 6.80e-01 100
>>
>> VecNorm 89991 1.1 5.5282e+00 1.2 1.67e+09 1.1 0.0e+00 0.0e+00 0.0e+00 4 3 0 0 0 4 3 0 0 0 1139 1148 0 0.00e+00 84992 6.80e-01 100
>>
>> VecScale 89991 1.1 1.3902e+00 1.2 8.33e+08 1.1 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 2265 2343 84992 6.80e-01 0 0.00e+00 100
>>
>> VecCopy 178201 1.1 2.9825e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> VecSet 3589 1.1 1.0195e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> VecAXPY 179091 1.1 2.7456e+00 1.2 3.32e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 4564 4739 169142 1.35e+00 0 0.00e+00 100
>>
>> VecCUDACopyTo 891 1.1 1.5322e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 842 6.23e+01 0 0.00e+00 0
>>
>> VecCUDACopyFrom 891 1.1 1.5837e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 842 6.23e+01 0
>>
>> DMCreateMat 5 1.0 7.3491e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00 1 0 0 0 5 1 0 0 0 6 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> SFSetGraph 5 1.0 3.5016e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> MatMult 89991 1.1 2.0423e+00 1.2 5.08e+10 1.1 0.0e+00 0.0e+00 0.0e+00 2 87 0 0 0 2 87 0 0 0 94039 105680 1683 2.00e+03 0 0.00e+00 100
>>
>> MatCopy 891 1.1 1.3600e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> MatConvert 2 1.0 1.0489e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> MatScale 2 1.0 2.7950e-04 1.3 3.18e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4530 0 0 0.00e+00 0 0.00e+00 0
>>
>> MatAssemblyBegin 7 1.0 6.3768e-0368.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> MatAssemblyEnd 7 1.0 7.9870e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 3 0 0 0 0 4 0 0 0 0.00e+00 0 0.00e+00 0
>>
>> MatCUSPARSCopyTo 891 1.1 1.5229e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 842 1.93e+03 0 0.00e+00 0
>>
>>
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> Memory usage is given in bytes:
>>
>>
>> Object Type Creations Destructions Memory Descendants' Mem.
>>
>> Reports information only for process 0.
>>
>>
>> --- Event Stage 0: Main Stage
>>
>>
>> Vector 69 11 19112 0.
>>
>> Distributed Mesh 3 0 0 0.
>>
>> Index Set 12 10 187512 0.
>>
>> IS L to G Mapping 3 0 0 0.
>>
>> Star Forest Graph 11 0 0 0.
>>
>> Discrete System 3 0 0 0.
>>
>> Weak Form 3 0 0 0.
>>
>> Application Order 1 0 0 0.
>>
>> Matrix 8 0 0 0.
>>
>> Krylov Solver 1 0 0 0.
>>
>> Preconditioner 1 0 0 0.
>>
>> Viewer 1 0 0 0.
>>
>>
>> ========================================================================================================================
>>
>> Average time to get PetscTime(): 4.32e-08
>>
>> Average time for MPI_Barrier(): 9.94e-07
>>
>> Average time for zero size MPI_Send(): 4.20135e-05
>>
>>
>> Sincerely,
>>
>> SG
>>
>> On Tue, Oct 19, 2021 at 12:28 AM Junchao Zhang <junchao.zhang at gmail.com>
>> wrote:
>>
>>>
>>>
>>>
>>> On Mon, Oct 18, 2021 at 10:56 PM Swarnava Ghosh <swarnava89 at gmail.com>
>>> wrote:
>>>
>>>> I am trying to port parts of the following function to GPUs.
>>>> Essentially, the lines of code between the two "TODO..." comments should
>>>> be executed on the device. Here is the function:
>>>>
>>>> PetscScalar CalculateSpectralNodesAndWeights(LSDFT_OBJ *pLsdft, int p,
>>>> int LIp)
>>>> {
>>>>
>>>> PetscInt N_qp;
>>>> N_qp = pLsdft->N_qp;
>>>>
>>>> int k;
>>>> PetscScalar *a, *b;
>>>> k=0;
>>>>
>>>> PetscMalloc(sizeof(PetscScalar)*(N_qp+1), &a);
>>>> PetscMalloc(sizeof(PetscScalar)*(N_qp+1), &b);
>>>>
>>>> /*
>>>> * TODO: COPY a, b, pLsdft->Vk, pLsdft->Vkm1, pLsdft->Vkp1,
>>>> pLsdft->LapPlusVeffOprloc, k,p,N_qp from HOST to DEVICE
>>>> * DO THE FOLLOWING OPERATIONS ON DEVICE
>>>> */
>>>>
>>>> //zero out vectors
>>>> VecZeroEntries(pLsdft->Vk);
>>>> VecZeroEntries(pLsdft->Vkm1);
>>>> VecZeroEntries(pLsdft->Vkp1);
>>>>
>>>> VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES);
>>>> MatMult(pLsdft->LapPlusVeffOprloc,pLsdft->Vkm1,pLsdft->Vk);
>>>> VecDot(pLsdft->Vkm1, pLsdft->Vk, &a[0]);
>>>> VecAXPY(pLsdft->Vk, -a[0], pLsdft->Vkm1);
>>>> VecNorm(pLsdft->Vk, NORM_2, &b[0]);
>>>> VecScale(pLsdft->Vk, 1.0 / b[0]);
>>>>
>>>> for (k = 0; k < N_qp; k++) {
>>>> MatMult(pLsdft->LapPlusVeffOprloc,pLsdft->Vk,pLsdft->Vkp1);
>>>> VecDot(pLsdft->Vk, pLsdft->Vkp1, &a[k + 1]);
>>>> VecAXPY(pLsdft->Vkp1, -a[k + 1], pLsdft->Vk);
>>>> VecAXPY(pLsdft->Vkp1, -b[k], pLsdft->Vkm1);
>>>> VecCopy(pLsdft->Vk, pLsdft->Vkm1);
>>>> VecNorm(pLsdft->Vkp1, NORM_2, &b[k + 1]);
>>>> VecCopy(pLsdft->Vkp1, pLsdft->Vk);
>>>> VecScale(pLsdft->Vk, 1.0 / b[k + 1]);
>>>> }
>>>>
>>>> /*
>>>> * TODO: Copy back a, b, pLsdft->Vk, pLsdft->Vkm1, pLsdft->Vkp1,
>>>> pLsdft->LapPlusVeffOprloc, k,p,N_qp from DEVICE to HOST
>>>> */
>>>>
>>>> /*
>>>> * Some operation with a, and b on HOST
>>>> *
>>>> */
>>>> TridiagEigenVecSolve_NodesAndWeights(pLsdft, a, b, N_qp, LIp); //
>>>> operation on the host
>>>>
>>>> // free a,b
>>>> PetscFree(a);
>>>> PetscFree(b);
>>>>
>>>> return 0;
>>>> }
>>>>
>>>> If I just use the command line options to set the vectors Vk, Vkp1, and
>>>> Vkm1 as cuda vectors and the matrix LapPlusVeffOprloc as aijcusparse, will
>>>> the lines of code between the two "TODO" comments be executed entirely on
>>>> the device?
>>>>
>>> Yes, except for VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES), which is
>>> done on the CPU: the vector data is pulled down from the GPU, the value is
>>> set on the host, and subsequent vector operations push the updated data
>>> back to the GPU.
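>>>
>>> If that single update matters, one possible sketch (assuming Vkm1 ends up as
>>> a CUDA vector, p is a local index into the part of the vector owned by this
>>> process, and <cuda_runtime.h> is available in this translation unit) is to
>>> write the entry directly into the device array:
>>>
>>>   PetscScalar one = 1.0, *d_v;
>>>   /* get read/write access to the raw device pointer of the CUDA vector */
>>>   VecCUDAGetArray(pLsdft->Vkm1, &d_v);
>>>   /* copy the single host value into slot p of the device array */
>>>   cudaMemcpy(d_v + p, &one, sizeof(PetscScalar), cudaMemcpyHostToDevice);
>>>   /* restoring marks the device copy as the current data */
>>>   VecCUDARestoreArray(pLsdft->Vkm1, &d_v);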
>>>
>>>
>>>>
>>>> Sincerely,
>>>> Swarnava
>>>>
>>>>
>>>> On Mon, Oct 18, 2021 at 10:13 PM Swarnava Ghosh <swarnava89 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks for the clarification, Junchao.
>>>>>
>>>>> Sincerely,
>>>>> Swarnava
>>>>>
>>>>> On Mon, Oct 18, 2021 at 10:08 PM Junchao Zhang <
>>>>> junchao.zhang at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 18, 2021 at 8:47 PM Swarnava Ghosh <swarnava89 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Junchao,
>>>>>>>
>>>>>>> If I want to pass command line options as -mymat_mat_type
>>>>>>> aijcusparse, should it be MatSetOptionsPrefix(A,"mymat"); or
>>>>>>> MatSetOptionsPrefix(A,"mymat_"); ? Could you please clarify?
>>>>>>>
>>>>>> My fault: it should be MatSetOptionsPrefix(A,"mymat_"), as seen in
>>>>>> mat/tests/ex62.c.
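>>>>>>
>>>>>> A minimal sketch of the whole pattern (error checking omitted; the size n
>>>>>> and the prefix are just placeholders):
>>>>>>
>>>>>>   Mat A;
>>>>>>   Vec v;
>>>>>>   PetscInt n = 100;                  /* placeholder global size */
>>>>>>   MatCreate(PETSC_COMM_WORLD, &A);
>>>>>>   MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
>>>>>>   MatSetOptionsPrefix(A, "mymat_");
>>>>>>   MatSetFromOptions(A);              /* picks up -mymat_mat_type aijcusparse */
>>>>>>   /* ... preallocate, fill, and assemble A ... */
>>>>>>   MatCreateVecs(A, &v, NULL);        /* v gets a type compatible with A */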
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Sincerely,
>>>>>>> Swarnava
>>>>>>>
>>>>>>> On Mon, Oct 18, 2021 at 9:23 PM Junchao Zhang <
>>>>>>> junchao.zhang at gmail.com> wrote:
>>>>>>>
>>>>>>>> MatSetOptionsPrefix(A,"mymat")
>>>>>>>> VecSetOptionsPrefix(v,"myvec")
>>>>>>>>
>>>>>>>> --Junchao Zhang
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Oct 18, 2021 at 8:04 PM Chang Liu <cliu at pppl.gov> wrote:
>>>>>>>>
>>>>>>>>> Hi Junchao,
>>>>>>>>>
>>>>>>>>> Thank you for your answer. I tried MatConvert and it works. It did
>>>>>>>>> not work before because I had forgotten to convert a vector from mpi
>>>>>>>>> to mpicuda.
>>>>>>>>>
>>>>>>>>> For vectors, there is no VecConvert to use, so I have to do
>>>>>>>>> VecDuplicate, VecSetType, and VecCopy. Is there an easier option?
>>>>>>>>>
>>>>>>>> As Matt suggested, you could single out the matrix and vector with an
>>>>>>>> options prefix and set their types on the command line:
>>>>>>>>
>>>>>>>> MatSetOptionsPrefix(A,"mymat");
>>>>>>>> VecSetOptionsPrefix(v,"myvec");
>>>>>>>>
>>>>>>>> Then, -mymat_mat_type aijcusparse -myvec_vec_type cuda
>>>>>>>>
>>>>>>>> A simpler approach is to have the vector type set automatically by
>>>>>>>> MatCreateVecs(A,&v,NULL).
>>>>>>>>
>>>>>>>>
>>>>>>>>> Chang
>>>>>>>>>
>>>>>>>>> On 10/18/21 5:23 PM, Junchao Zhang wrote:
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > On Mon, Oct 18, 2021 at 3:42 PM Chang Liu via petsc-users
>>>>>>>>> > <petsc-users at mcs.anl.gov> wrote:
>>>>>>>>> >
>>>>>>>>> > Hi Matt,
>>>>>>>>> >
>>>>>>>>> > I have a related question. In my code I have many matrices
>>>>>>>>> and I only
>>>>>>>>> > want to have one living on GPU, the others still staying on
>>>>>>>>> CPU mem.
>>>>>>>>> >
>>>>>>>>> > I wonder if there is an easier way to copy a mpiaij matrix to
>>>>>>>>> > mpiaijcusparse (in other words, copy data to GPUs). I can
>>>>>>>>> think of
>>>>>>>>> > creating a new mpiaijcusparse matrix, and copying the data
>>>>>>>>> line by
>>>>>>>>> > line.
>>>>>>>>> > But I wonder if there is a better option.
>>>>>>>>> >
>>>>>>>>> > I have tried MatCopy and MatConvert but neither work.
>>>>>>>>> >
>>>>>>>>> > Did you use MatConvert(mat,matype,MAT_INPLACE_MATRIX,&mat)?
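>>>>>>>>> > For example, for an MPIAIJ matrix A (just a sketch of the call):
>>>>>>>>> > MatConvert(A, MATMPIAIJCUSPARSE, MAT_INPLACE_MATRIX, &A);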
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Chang
>>>>>>>>> >
>>>>>>>>> > On 10/17/21 7:50 PM, Matthew Knepley wrote:
>>>>>>>>> > > On Sun, Oct 17, 2021 at 7:12 PM Swarnava Ghosh
>>>>>>>>> > > <swarnava89 at gmail.com> wrote:
>>>>>>>>> > >
>>>>>>>>> > > Do I need to convert the MATSEQBAIJ to a CUDA matrix in code?
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > > You would need a call to MatSetFromOptions() to take that type
>>>>>>>>> > > from the command line, and not have the type hard-coded in your
>>>>>>>>> > > application. It is generally a bad idea to hard code the
>>>>>>>>> > > implementation type.
>>>>>>>>> > >
>>>>>>>>> > > If I do it from the command line, are the other MatVec calls then
>>>>>>>>> > > ported onto CUDA? I have many MatVec calls in my code, but I
>>>>>>>>> > > specifically want to port just one call.
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > > You can give that one matrix an options prefix to isolate it.
>>>>>>>>> > >
>>>>>>>>> > > Thanks,
>>>>>>>>> > >
>>>>>>>>> > > Matt
>>>>>>>>> > >
>>>>>>>>> > > Sincerely,
>>>>>>>>> > > Swarnava
>>>>>>>>> > >
>>>>>>>>> > > On Sun, Oct 17, 2021 at 7:07 PM Junchao Zhang
>>>>>>>>> > > <junchao.zhang at gmail.com> wrote:
>>>>>>>>> > >
>>>>>>>>> > > You can do that with command line options -mat_type aijcusparse
>>>>>>>>> > > -vec_type cuda
>>>>>>>>> > >
>>>>>>>>> > > On Sun, Oct 17, 2021, 5:32 PM Swarnava Ghosh
>>>>>>>>> > > <swarnava89 at gmail.com <mailto:
>>>>>>>>> swarnava89 at gmail.com>
>>>>>>>>> > <mailto:swarnava89 at gmail.com <mailto:swarnava89 at gmail.com>>>
>>>>>>>>> wrote:
>>>>>>>>> > >
>>>>>>>>> > > Dear Petsc team,
>>>>>>>>> > >
>>>>>>>>> > > I had a query regarding using CUDA to accelerate a matrix vector
>>>>>>>>> > > product.
>>>>>>>>> > > I have a sequential sparse matrix (MATSEQBAIJ type). I want to
>>>>>>>>> > > port a MatVec call onto GPUs. Is there any code/example I can
>>>>>>>>> > > look at?
>>>>>>>>> > >
>>>>>>>>> > > Sincerely,
>>>>>>>>> > > SG
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > > --
>>>>>>>>> > > What most experimenters take for granted before they begin their
>>>>>>>>> > > experiments is infinitely more interesting than any results to
>>>>>>>>> > > which their experiments lead.
>>>>>>>>> > > -- Norbert Wiener
>>>>>>>>> > >
>>>>>>>>> > > https://www.cse.buffalo.edu/~knepley/
>>>>>>>>> >
>>>>>>>>> > --
>>>>>>>>> > Chang Liu
>>>>>>>>> > Staff Research Physicist
>>>>>>>>> > +1 609 243 3438
>>>>>>>>> > cliu at pppl.gov
>>>>>>>>> > Princeton Plasma Physics Laboratory
>>>>>>>>> > 100 Stellarator Rd, Princeton NJ 08540, USA
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Chang Liu
>>>>>>>>> Staff Research Physicist
>>>>>>>>> +1 609 243 3438
>>>>>>>>> cliu at pppl.gov
>>>>>>>>> Princeton Plasma Physics Laboratory
>>>>>>>>> 100 Stellarator Rd, Princeton NJ 08540, USA
>>>>>>>>>
>>>>>>>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>