[petsc-users] Matrix and vector type selection & memory allocation for efficient matrix import?

Matthew Knepley knepley at gmail.com
Fri Apr 20 15:37:43 CDT 2018


On Fri, Apr 20, 2018 at 4:28 PM, Zou, Ling <ling.zou at inl.gov> wrote:

> Mat, is this book you recommended?
>
> https://www.amazon.com/Using-MPI-Programming-Message-Passing-Engineering/dp/0262527391/ref=pd_lpo_sbs_14_img_0?_encoding=UTF8&psc=1&refRID=EYCV0H0J5EQ9M0GDKWFT
>

Yep. I think it's the best one, although Victor's book is also nice:

   http://pages.tacc.utexas.edu/~eijkhout/istc/istc.html

  Matt


> Thanks,
>
> Ling
>
> On Fri, Apr 20, 2018 at 2:17 PM, Matthew Knepley <knepley at gmail.com>
> wrote:
>
>> On Fri, Apr 20, 2018 at 4:07 PM, Klaus Burkart <k_burkart at yahoo.com>
>> wrote:
>>
>>> Sorry, I don't understand you:
>>>
>>> The example says:
>>>
>>> Step 1: Create the global matrix
>>>
>>>     MatCreate(...,&A); // no problem
>>>
>>> Step 2: Make it a parallel matrix
>>>
>>>     MatSetType(A,MATMPIAIJ); // no problem
>>>
>>> Step 3: Define the size of the global matrix and the number of rows per
>>> process IF this number is the same for all processes
>>>
>>>     MatSetSizes(A, N,n,N,N); In the example, I have the problem with n,
>>> which is 3 or 2 depending on the process, but I can only set n to 3 or to
>>> 2, so it will be wrong for at least one process
>>>
>>
>> 1) Get the book "Using MPI" just like I suggested. It explains this part
>> of parallel programming that you do not understand.
>>
>> 2) Suppose we have two processes P0 and P1. Here are the calls made on
>> both processes for a matrix with 5 rows split 3, 2:
>>
>>  P0: MatSetSizes(A, 3, 3, 5, 5);
>>  P1: MatSetSizes(A, 2, 2, 5, 5);
>>
>> See how different processes give different numbers to the routine? This
>> is what SPMD programming is all about.
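>>
>> As a minimal SPMD sketch (the variable names such as nLocalRows are only
>> illustrative, not part of the PETSc API), every rank computes its own
>> local row count and then makes the same call with its own numbers:
>>
>>   PetscInt    N = 5;            /* global number of rows/columns */
>>   PetscInt    nLocalRows;
>>   PetscMPIInt rank, size;
>>   MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>>   MPI_Comm_size(PETSC_COMM_WORLD, &size);
>>   /* split N rows as evenly as possible; the first N % size ranks get one extra row */
>>   nLocalRows = N / size + ((N % size) > rank ? 1 : 0);
>>   MatSetSizes(A, nLocalRows, nLocalRows, N, N);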
>>
>>   Thanks,
>>
>>     Matt
>>
>>
>>> Step 4: Preallocate memory for the d_nnz and o_nnz values which are
>>> different for each row (for each n)
>>>
>>>     MatMPIAIJSetPreallocation(A,0,d_nnz[n],0,o_nnz[n]); // How to do
>>> this repeatedly for all processes, especially when using PETSC_DECIDE for n
>>> and m as shown in many examples, in which case I don't even know the n per
>>> process
>>>
>>> I have to retrieve the relevant values (content and structure) from an
>>> application, and I absolutely don't understand how to enter the exact same
>>> matrix structure as shown in the example (only much larger) into PETSc
>>> using the retrieved data.
>>>
>>> How would I assign the d_nnz and o_nnz values of a global matrix on a
>>> 32-core workstation to the 32 processes using MatMPIAIJSetPreallocation(A,0,d_nnz[n],0,o_nnz[n]);?
>>> (Which means, in my case, assigning 64 arrays with different content
>>> containing the d_nnz and o_nnz values for the 32 processes.)
>>>
>>> Would I use a for loop - but then how do I get hold of the individual
>>> processes? Same for Step 3?
>>>
>>> I don't get the underlying idea about how to create a parallel matrix
>>> structure in PETSc - don't get me wrong, I understand the layout, "simply"
>>> not how to enter more than one local matrix into PETSc.
>>>
>>> Klaus
>>>
>>> On Friday, April 20, 2018, 19:52:11 MESZ, Matthew Knepley <
>>> knepley at gmail.com> wrote:
>>>
>>>
>>> On Fri, Apr 20, 2018 at 1:30 PM, Klaus Burkart <k_burkart at yahoo.com>
>>> wrote:
>>>
>>> In my case N=M, but n for processes 0, 1, 2, 3, ..., no_processes-1 can
>>> differ from process to process, like in the example where the last
>>> process (Proc2) has only two rows while all other processes have three
>>> rows:
>>>
>>>
>>> Yes.
>>>
>>>
>>> Example from the PETSc webpage mentioned before:
>>>
>>>             1  2  0  |  0  3  0  |  0  4
>>>     Proc0   0  5  6  |  7  0  0  |  8  0
>>>             9  0 10  | 11  0  0  | 12  0
>>>     ------------------------------ -------
>>>            13  0 14  | 15 16 17  |  0  0
>>>     Proc1   0 18  0  | 19 20 21  |  0  0
>>>             0  0  0  | 22 23  0  | 24  0
>>>     ------------------------------ -------
>>>     Proc2  25 26 27  |  0  0 28  | 29  0
>>>            30  0  0  | 31 32 33  |  0 34
>>>
>>> and I need to enter different values for d_nnz and o_nnz for each row
>>> somewhere too
>>>
>>>      proc0: d_nnz = [2,2,2] and o_nnz = [2,2,2]
>>>      proc1: d_nnz = [3,3,2] and o_nnz = [2,1,1]
>>>      proc2: d_nnz = [1,1]   and o_nnz = [4,4]
>>>
>>> Each process only sets d_nnz and o_nnz for its LOCAL rows. Thus, it is
>>> exactly as shown above. Proc1 sets values
>>> only for rows 3-5.
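>>>
>>> As a hedged sketch of what that means in code for your 3-process example
>>> (the array contents are the ones you listed above; each rank declares
>>> only its own local arrays):
>>>
>>>   /* Proc0 would hold */  PetscInt d_nnz[] = {2,2,2}, o_nnz[] = {2,2,2};
>>>   /* Proc1 would hold */  PetscInt d_nnz[] = {3,3,2}, o_nnz[] = {2,1,1};
>>>   /* Proc2 would hold */  PetscInt d_nnz[] = {1,1},   o_nnz[] = {4,4};
>>>
>>>   /* every rank then makes the same call with its own local arrays */
>>>   MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);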
>>>
>>>   Thanks,
>>>
>>>     Matt
>>>
>>>
>>> I simply can't identify the function(s) used to set the values for n,
>>> d_nnz and o_nnz for the individual local matrices allocated to all the
>>> processes if n isn't the same for all processes and d_nnz and o_nnz are
>>> different for each local matrix?
>>>
>>> Approach described on the PETSc webpage:
>>>
>>>     MatCreate(...,&A);
>>>
>>>     MatSetType(A,MATMPIAIJ);
>>>
>>>     MatSetSizes(A, m,n,M,N); // for the example above using this function would set the no. of rows for Proc2 to 3 but it's 2
>>>
>>>     MatMPIAIJSetPreallocation(A,...); // this function can be used to set values for ONE local matrix only
>>>
>>>
>>>
>>> In addition to that I don't know which functions to use to preallocate
>>> memory for ALL local matrices when each of them has different values for
>>> d_nnz and o_nnz.
>>>
>>> In other words, what's the code for the 3-process example above?
>>> (entering the matrix structure and allocating memory)
>>>
>>> Klaus
>>>
>>> On Friday, April 20, 2018, 17:13:26 MESZ, Smith, Barry F. <
>>> bsmith at mcs.anl.gov> wrote:
>>>
>>>
>>>
>>>   For square matrices, n is almost always the same as m. On different
>>> processes m can be different. You get to decide, for each process, what
>>> its m should be.
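>>>
>>>   If you would rather not choose the split by hand, a minimal sketch
>>> (the global size 100 below is only illustrative) lets PETSc compute a
>>> default local size:
>>>
>>>     PetscInt n = PETSC_DECIDE, N = 100;
>>>     PetscSplitOwnership(PETSC_COMM_WORLD, &n, &N);
>>>     /* n now holds this rank's default share of the N rows */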
>>>
>>>   Barry
>>>
>>>
>>> > On Apr 20, 2018, at 10:05 AM, Klaus Burkart <k_burkart at yahoo.com>
>>> wrote:
>>> >
>>> > I think I understood the matrix structure for parallel computation,
>>> > with the rows, diagonal (d) and off-diagonal (o) structure; where I have
>>> > problems is how to do the setup, including memory allocation, in PETSc:
>>> >
>>> > Let's assume I use a 16-core workstation (= 16 processes), the number of
>>> > nonzeros varies in each row for both d and o, and the number of rows
>>> > assigned to each process differs too - at least for the nth process.
>>> >
>>> > Looking at the manual and http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ,
>>> > I don't understand how to enter a global matrix when n is NOT the same for
>>> > each process as e.g. in MatSetSizes(A, m,n,M,N); n and m are integers, not
>>> > arrays?
>>> >
>>> >    MatCreate(...,&A);
>>> >
>>> >    MatSetType(A,MATMPIAIJ);
>>> >
>>> >    MatSetSizes(A, m,n,M,N); // seems to assume n and m are the same
>>> for each process which isn't even the case in the example on the page
>>> mentioned above?!
>>> >
>>> >    MatMPIAIJSetPreallocation(A,...);
>>> >
>>> >
>>> > How can I enter the parallel global-local matrix structure?
>>> >
>>> > How can the memory preallocation be done?
>>> >
>>> > Klaus
>>> >
>>> > On Thursday, April 19, 2018, 01:47:59 MESZ, Smith, Barry F. <
>>> > bsmith at mcs.anl.gov> wrote:
>>> >
>>> >
>>> >
>>> >
>>> > > On Apr 18, 2018, at 4:42 PM, k_burkart at yahoo.com wrote:
>>> > >
>>> > > So, practically speaking, I should invent routines to decompose the
>>> > > matrix, e.g. into a block matrix structure, to be able to make real use of
>>> > > PETSc, i.e. be able to solve a linear system using more than one process/core?
>>> >
>>> >  To really use PETSc efficiently/effectively you need to generate your
>>> matrix in parallel.
>>> >
>>> >  Barry
>>> >
>>> > >
>>> > > Klaus
>>> > >
>>> > > Sent from my Huawei mobile phone
>>> > >
>>> > >
>>> > > -------- Original Message --------
>>> > > Subject: Re: [petsc-users] Matrix and vector type selection & memory
>>> > > allocation for efficient matrix import?
>>> > > From: "Smith, Barry F."
>>> > > To: Klaus Burkart
>>> > > Cc: PETSc Users List
>>> > >
>>> > >
>>> > >
>>> > > If you can only generate the nonzero allocation sequentially you can
>>> only solve sequentially which means your matrix is MATSEQAIJ and your
>>> vector is VECSEQ and your communicator is PETSC_COMM_SELF.
>>> > >
>>> > > If you pass an array for nnz, what you pass for nz is irrelevant;
>>> > > you might as well pass 0.
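>>> > >
>>> > > A minimal sequential sketch (the sizes and names below are
>>> > > illustrative only, not taken from your application) would then be:
>>> > >
>>> > >   Mat       M;
>>> > >   PetscInt  N = 1000;        /* global size, illustrative */
>>> > >   PetscInt *nnz;             /* nnz[i] = number of nonzeros in row i */
>>> > >   PetscMalloc1(N, &nnz);
>>> > >   /* ... fill nnz[0..N-1] from the application's row counts ... */
>>> > >   MatCreate(PETSC_COMM_SELF, &M);
>>> > >   MatSetSizes(M, N, N, N, N);
>>> > >   MatSetType(M, MATSEQAIJ);
>>> > >   MatSeqAIJSetPreallocation(M, 0, nnz);
>>> > >   PetscFree(nnz);            /* PETSc keeps its own copy of the preallocation info */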
>>> > >
>>> > > Barry
>>> > >
>>> > >
>>> > > > On Apr 18, 2018, at 10:48 AM, Klaus Burkart wrote:
>>> > > >
>>> > > > More questions about matrix and vector type selection for my
>>> application:
>>> > > >
>>> > > > My starting point is a huge sparse matrix, which can be symmetric
>>> or asymmetric, and a rhs vector. There's no defined local or block structure
>>> at all, just row and column indices, the values, and an array-style rhs
>>> vector, together describing the entire linear system to be solved. With
>>> quite some effort, I should be able to create an array nnz[N] containing
>>> the number of nonzeros per row in the global matrix for memory allocation,
>>> which would leave me with MatSeqAIJSetPreallocation(M, 0, nnz); as the only
>>> option for efficient memory allocation, i.e. a MATSEQAIJ matrix and VECSEQ. I
>>> assume here that 0 indicates different numbers of nonzero values in each
>>> row, the exact number being stored in the nnz array. Regarding this detail,
>>> all but one example assume a constant number of nz per row, so I am not sure
>>> whether I should write 0 or NULL for nz?
>>> > > >
>>> > > > I started with:
>>> > > >
>>> > > > MatCreate(PETSC_COMM_WORLD, &M);
>>> > > > MatSetSizes(M, PETSC_DECIDE, PETSC_DECIDE, N, N);
>>> > > > MatSetFromOptions(M);
>>> > > >
>>> > > > taken from a paper, and assume the latter would set the matrix
>>> type to MATSEQAIJ, which might conflict with PETSC_COMM_WORLD. Maybe
>>> decomposition took place at an earlier stage and the authors of the
>>> paper were able to retrieve the local data and structure.
>>> > > >
>>> > > > What type of matrix and vector should I use for my application
>>> e.g. MATSEQAIJ and VECSEQ to be able to use MatSeqAIJSetPreallocation(M, 0,
>>> nnz); for efficient memory allocation?
>>> > > >
>>>
>>> > > > In this case, where would the decomposition / MPI process
>>> allocation take place?
>>> > >
>>>
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>>
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

