[petsc-users] Matrix and vector type selection & memory allocation for efficient matrix import?
Matthew Knepley
knepley at gmail.com
Fri Apr 20 15:37:43 CDT 2018
On Fri, Apr 20, 2018 at 4:28 PM, Zou, Ling <ling.zou at inl.gov> wrote:
> Mat, is this book you recommended?
>
> https://www.amazon.com/Using-MPI-Programming-Message-
> Passing-Engineering/dp/0262527391/ref=pd_lpo_sbs_14_
> img_0?_encoding=UTF8&psc=1&refRID=EYCV0H0J5EQ9M0GDKWFT
>
Yep. I think it's the best one, although Victor's book is also nice:
http://pages.tacc.utexas.edu/~eijkhout/istc/istc.html
Matt
> Thanks,
>
> Ling
>
> On Fri, Apr 20, 2018 at 2:17 PM, Matthew Knepley <knepley at gmail.com>
> wrote:
>
>> On Fri, Apr 20, 2018 at 4:07 PM, Klaus Burkart <k_burkart at yahoo.com>
>> wrote:
>>
>>> Sorry, I don't understand you:
>>>
>>> The example says:
>>>
>>> Step 1: Create the global matrix
>>>
>>> MatCreate(...,&A); // no problem
>>>
>>> Step 2: Make it a parallel matrix
>>>
>>> MatSetType(A,MATMPIAIJ); // no problem
>>>
>>> Step 3: Define the size of the global matrix and the number of rows per
>>> process IF this number is the same for all processes
>>>
>>> MatSetSizes(A, m,n,M,N); In the example, I have the problem with n,
>>> which is 3 or 2 depending on the process, but I can only set n to 3 or 2, so
>>> it will be wrong for at least one process.
>>>
>>
>> 1) Get the book "Using MPI" just like I suggested. It explains this part
>> of parallel programming that you do not understand.
>>
>> 2) Suppose we have two processes P0 and P1. Here are the calls made on
>> each process for a matrix with 5 rows split 3, 2:
>>
>> P0: MatSetSizes(A, 3, 3, 5, 5);
>> P1: MatSetSizes(A, 2, 2, 5, 5);
>>
>> See how different processes give different numbers to the routine? This
>> is what SPMD programming is all about.
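>>
>> A minimal sketch of what each rank would run (not a complete program; error
>> checking is omitted and the 3/2 split is hard-coded just for this 5-row example):
>>
>>   PetscInt    N = 5, m;
>>   PetscMPIInt rank;
>>   Mat         A;
>>
>>   MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>>   m = (rank == 0) ? 3 : 2;           /* each rank computes ITS OWN local size */
>>
>>   MatCreate(PETSC_COMM_WORLD, &A);
>>   MatSetType(A, MATMPIAIJ);
>>   MatSetSizes(A, m, m, N, N);        /* same line of code, different m per rank */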
>>
>> Thanks,
>>
>> Matt
>>
>>
>>> Step 4: Preallocate memory for the d_nnz and o_nnz values which are
>>> different for each row (for each n)
>>>
>>> MatMPIAIJSetPreallocation(A,0,d_nnz[n],0,o_nnz[n]); // How do I do
>>> this repeatedly for all processes, especially when using PETSC_DECIDE for n
>>> and m as shown in many examples, in which case I don't even know the n per
>>> process?
>>>
>>> I have to retrieve the relevant values (content and structure) from an
>>> application, and I absolutely don't understand how to enter the exact same
>>> matrix structure as shown in the example (only much larger) into PETSc
>>> using the retrieved data.
>>>
>>> How would I assign the d_nnz and o_nnz values of a global matrix on a 32
>>> core workstation to the 32 processes using MatMPIAIJSetPreallocation(A,0,d_nnz[n],0,o_nnz[n]);?
>>> (Which means, in my case, assigning 64 arrays with different content
>>> containing the d_nnz and o_nnz values for the 32 processes.)
>>>
>>> Would I use a for loop - but then how to get hold of the individual
>>> processes? Same for Step 3?
>>>
>>> I don't get the underlying idea about how to create a parallel matrix
>>> structure in PETSc - don't get me wrong, I understand the layout, "simply"
>>> not how to enter more than one local matrix into PETSc.
>>>
>>> Klaus
>>>
>>> On Friday, 20 April 2018, 19:52:11 MESZ, Matthew Knepley <
>>> knepley at gmail.com> wrote:
>>>
>>>
>>> On Fri, Apr 20, 2018 at 1:30 PM, Klaus Burkart <k_burkart at yahoo.com>
>>> wrote:
>>>
>>> In my case N=M, but the local size n for processes 0, 1, 2, 3, ...,
>>> no_processes-1 can differ from process to process, like in the example
>>> where the last process (Proc2) has only two rows while all other processes
>>> have three rows:
>>>
>>>
>>> Yes.
>>>
>>>
>>> Example from the PETSc webpage mentioned before:
>>>
>>>         1  2  0  |  0  3  0  |  0  4
>>> Proc0   0  5  6  |  7  0  0  |  8  0
>>>         9  0 10  | 11  0  0  | 12  0
>>> -------------------------------------
>>>        13  0 14  | 15 16 17  |  0  0
>>> Proc1   0 18  0  | 19 20 21  |  0  0
>>>         0  0  0  | 22 23  0  | 24  0
>>> -------------------------------------
>>> Proc2  25 26 27  |  0  0 28  | 29  0
>>>        30  0  0  | 31 32 33  |  0 34
>>>
>>> and I need to enter different values for d_nnz and o_nnz for each row
>>> somewhere too
>>>
>>> proc0: d_nnz = [2,2,2] and o_nnz = [2,2,2]
>>> proc1: d_nnz = [3,3,2] and o_nnz = [2,1,1]
>>> proc2: d_nnz = [1,1] and o_nnz = [4,4]
>>>
>>> Each process only sets d_nnz and o_nnz for its LOCAL rows. Thus, it is
>>> exactly as shown above. Proc1 sets values
>>> only for rows 3-5.
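>>>
>>> A rough sketch for the 3-process example above (every rank runs the same
>>> code and just picks its own numbers; A is assumed already created with
>>> MatCreate()/MatSetType() as before, the arrays are the ones listed above,
>>> and error checking is omitted):
>>>
>>>   PetscInt       m;
>>>   const PetscInt d0[] = {2,2,2}, o0[] = {2,2,2};   /* proc0 */
>>>   const PetscInt d1[] = {3,3,2}, o1[] = {2,1,1};   /* proc1 */
>>>   const PetscInt d2[] = {1,1},   o2[] = {4,4};     /* proc2 */
>>>   const PetscInt *d_nnz, *o_nnz;
>>>   PetscMPIInt    rank;
>>>
>>>   MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>>>   if      (rank == 0) { m = 3; d_nnz = d0; o_nnz = o0; }
>>>   else if (rank == 1) { m = 3; d_nnz = d1; o_nnz = o1; }
>>>   else                { m = 2; d_nnz = d2; o_nnz = o2; }
>>>
>>>   MatSetSizes(A, m, m, 8, 8);                        /* 8x8 global matrix     */
>>>   MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);  /* LOCAL rows only       */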
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>
>>> I simply can't identify the function(s) used to set the values for n,
>>> d_nnz and o_nnz for the individual local matrices allocated to all the
>>> processes when n isn't the same for all processes and d_nnz and o_nnz are
>>> different for each local matrix.
>>>
>>> Approach described on the PETSc webpage:
>>>
>>> MatCreate <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreate.html#MatCreate>(...,&A);
>>>
>>> MatSetType <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetType.html#MatSetType>(A,MATMPIAIJ <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATMPIAIJ.html#MATMPIAIJ>);
>>>
>>> MatSetSizes <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetSizes.html#MatSetSizes>(A, m,n,M,N); // for the example above using this function would set the no. of rows for Proc2 to 3 but it's 2
>>>
>>> MatMPIAIJSetPreallocation <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation>(A,...); // this function can be used to set values for ONE local matrix only
>>>
>>>
>>>
>>> In addition to that I don't know which functions to use to preallocate
>>> memory for ALL local matrices when each of them has different values for
>>> d_nnz and o_nnz.
>>>
>>> In other words, what's the code for the 3 process example above?
>>> (entering the matrix structure and allocating memory)
>>>
>>> Klaus
>>>
>>> On Friday, 20 April 2018, 17:13:26 MESZ, Smith, Barry F. <
>>> bsmith at mcs.anl.gov> wrote:
>>>
>>>
>>>
>>> For square matrices, n is almost always the same as m. On different
>>> processes m can be different. You get to decide what makes sense for each
>>> process, i.e., what its m should be.
>>>
>>> Barry
>>>
>>>
>>> > On Apr 20, 2018, at 10:05 AM, Klaus Burkart <k_burkart at yahoo.com>
>>> wrote:
>>> >
>>> > I think I understand the matrix structure for parallel computation
>>> with the rows, diagonal (d) and off-diagonal (o) structure; where I have
>>> problems is how to do the setup, including memory allocation, in PETSc:
>>> >
>>> > Let's assume I use a 16-core workstation (= 16 processes), the
>>> number of nonzeros varies in each row for both d and o, and the number of
>>> rows assigned to each process differs too - at least for the nth process.
>>> >
>>> > Looking at the manual and
>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ,
>>> I don't understand how to enter a global matrix when n is NOT the same for
>>> each process, as e.g. in MatSetSizes(A, m,n,M,N); n and m are integers, not
>>> arrays?
>>> >
>>> > MatCreate(...,&A);
>>> >
>>> > MatSetType(A,MATMPIAIJ);
>>> >
>>> > MatSetSizes(A, m,n,M,N); // seems to assume n and m are the same
>>> for each process which isn't even the case in the example on the page
>>> mentioned above?!
>>> >
>>> > MatMPIAIJSetPreallocation(A,...);
>>> >
>>> >
>>> > How can I enter the parallel global-local matrix structure?
>>> >
>>> > How can the memory preallocation be done?
>>> >
>>> > Klaus
>>> >
>>> > On Thursday, 19 April 2018, 01:47:59 MESZ, Smith, Barry F. <
>>> bsmith at mcs.anl.gov> wrote:
>>> >
>>> >
>>> >
>>> >
>>> > > On Apr 18, 2018, at 4:42 PM, k_burkart at yahoo.com wrote:
>>> > >
>>> > > So, practically speaking, I should invent routines to decompose the
>>> matrix, e.g. into a block matrix structure, to be able to make real use of
>>> PETSc, i.e. be able to solve a linear system using more than one process/core?
>>> >
>>> > To really use PETSc efficiently/effectively you need to generate your
>>> matrix in parallel.
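>>> >
>>> > For example, a rough sketch (not a full program; the per-row buffers and
>>> > MAX_ROW_NNZ are placeholders for whatever your application provides): once
>>> > sizes and preallocation are set, each rank fills only the rows it owns and
>>> > then assembles:
>>> >
>>> >   PetscInt    rstart, rend, i, ncols;
>>> >   PetscInt    cols[MAX_ROW_NNZ];            /* hypothetical per-row buffers */
>>> >   PetscScalar vals[MAX_ROW_NNZ];
>>> >
>>> >   MatGetOwnershipRange(A, &rstart, &rend);  /* rows owned by this rank      */
>>> >   for (i = rstart; i < rend; i++) {
>>> >     /* fill ncols, cols[], vals[] for row i from the application data here  */
>>> >     MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);
>>> >   }
>>> >   MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
>>> >   MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);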
>>> >
>>> > Barry
>>> >
>>> > >
>>> > > Klaus
>>> > >
>>> > > Sent from my Huawei mobile phone
>>> > >
>>> > >
>>> > > -------- Original Message --------
>>> > > Subject: Re: [petsc-users] Matrix and vector type selection & memory
>>> allocation for efficient matrix import?
>>> > > From: "Smith, Barry F."
>>> > > To: Klaus Burkart
>>> > > Cc: PETSc Users List
>>> > >
>>> > >
>>> > >
>>> > > If you can only generate the nonzero allocation sequentially, you can
>>> only solve sequentially, which means your matrix is MATSEQAIJ, your
>>> vector is VECSEQ, and your communicator is PETSC_COMM_SELF.
>>> > >
>>> > > If you pass an array for nnz, what you pass for nz is irrelevant;
>>> you might as well pass 0.
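>>> > >
>>> > > For example, a sketch (assuming an nnz[] array of length N has been
>>> > > built beforehand and N holds the number of rows):
>>> > >
>>> > >   MatCreate(PETSC_COMM_SELF, &M);
>>> > >   MatSetType(M, MATSEQAIJ);
>>> > >   MatSetSizes(M, N, N, N, N);            /* one process: local == global   */
>>> > >   MatSeqAIJSetPreallocation(M, 0, nnz);  /* nz = 0 is ignored, nnz[] rules */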
>>> > >
>>> > > Barry
>>> > >
>>> > >
>>> > > > On Apr 18, 2018, at 10:48 AM, Klaus Burkart wrote:
>>> > > >
>>> > > > More questions about matrix and vector type selection for my
>>> application:
>>> > > >
>>> > > > My starting point is a huge sparse matrix which can be symmetric
>>> or asymmetric and a rhs vector. There's no defined local or block structure
>>> at all, just row and column indices and the values and an array style rhs
>>> vector together describing the entire linear system to be solved. With
>>> quite some effort, I should be able to create an array nnz[N] containing
>>> the number of nonzeros per row in the global matrix for memory allocation
>>> which would leave me with MatSeqAIJSetPreallocation(M, 0, nnz); as the only
>>> option for efficient memory allocation ie. a MATSEQAIJ matrix and VECSEQ. I
>>> assume here, that 0 indicates different numbers of nonzero values in each
>>> row, the exact number being stored in the nnz array. Regarding this detail
>>> but one example assume a constant number of nz per row so I am not sure
>>> whether I should write 0 or NULL for nz?
>>> > > >
>>> > > > I started with:
>>> > > >
>>> > > > MatCreate(PETSC_COMM_WORLD, &M);
>>> > > > MatSetSizes(M, PETSC_DECIDE, PETSC_DECIDE, N, N);
>>> > > > MatSetFromOptions(M);
>>> > > >
>>> > > > taken from a paper, and assume the latter would set the matrix
>>> type to MATSEQAIJ, which might conflict with PETSC_COMM_WORLD. Maybe
>>> decomposition took place at an earlier stage and the authors of the
>>> paper were able to retrieve the local data and structure.
>>> > > >
>>> > > > What type of matrix and vector should I use for my application
>>> e.g. MATSEQAIJ and VECSEQ to be able to use MatSeqAIJSetPreallocation(M, 0,
>>> nnz); for efficient memory allocation?
>>> > > >
>>>
>>> > > > In this case, where would the decomposition / MPI process
>>> allocation take place?
>>> > >
>>>
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>>
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/