<div dir="ltr">Massoud,<div class="gmail_extra"><br><div class="gmail_quote">On 28 November 2016 at 20:18, Massoud Rezavand <span dir="ltr"><<a href="mailto:Massoud.Rezavand@uibk.ac.at" target="_blank">Massoud.Rezavand@uibk.ac.at</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
Thanks.<br>
<br>
As you know, in SPH method, the calculations are done over the neighboring particles (j) that fall inside a support domain defined by a circle over the particle of interest (i). Since the Lagrangian nature the method, the number of neighboring particles are varying slightly over time, e.g. in a 2D domain this number is varying between 43 to 51 (in my experience).<br>
<br>
The number of nonzeros per row (A_ij) is equal to the number of neighboring particles and normally is not fixed over time, therefore, we put the elements dynamically at each time step and we have to calculate d_nz and o_nz at each time iteration.<br>
<br>
In order to preallocate the matrix, another way would be to calculate the number of neighboring particles and set that as the number of nonzeros per row. Doing so, do you recommend to use :<br>
<br>
MatMPIAIJSetPreallocation()<br>
<br>
to preallocate A to achieve the best performance?<br></blockquote><div><br></div><div>The other thing to bear in mind is that PETSc objects like Mat and Vec are not really dynamic with respect to the partitioning.<br><br></div><div>In you SPH simulation, at each time step not only does the number of non-zeros (e.g. nearest neighbours) change, but likely so to will the number of particles per sub-domain (depending on how you define a sub-domain - see footnote below). Once you create a Mat and Vec object, the partition is defined once and for all and cannot be altered. Hence, when you particles cross sub-domain boundaries, you will have to destroy the matrix and re-create the non-zero structure and re-do the preallocation. The good news is the setup time for a new mat and vec in petsc is fast so I doubt you'll notice much overhead of the create/destroy being performed at each time step.<br><br></div><div>Thanks,<br></div><div> Dave<br></div><div><br>(*) My comment kind of assumes that since you are modelling incompressible fluids, you have a constant smoothing length and will partition the domain via boxes of size 2h and a sub-domain wrt to the particles will be defined via all the points live in a set of boxes mapped to a given MPI-rank. PPM probably has some clever load balancing strategy, but nevertheless I think you'll run into this issue with Mat and Vec.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
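A minimal sketch (not from the original emails) of the per-time-step rebuild described above, in PETSc C. The neighbour-list inputs (nlocal, nglobal, ncols, cols, vals) are hypothetical placeholders for whatever the SPH neighbour search provides; only the PETSc calls are the real API.

#include <petscmat.h>

/* Build (or rebuild) the pressure Poisson matrix for the current time step.
   Assumed inputs (hypothetical, from the SPH neighbour search):
     nlocal   - number of particles owned by this rank
     nglobal  - total number of particles
     ncols[i] - number of entries in local row i (the particle itself + its neighbours)
     cols[i]  - global column indices of those entries (0-based)
     vals[i]  - the corresponding coefficients                                   */
PetscErrorCode BuildPressureMatrix(MPI_Comm comm, PetscInt nlocal, PetscInt nglobal,
                                   const PetscInt *ncols, PetscInt **cols,
                                   PetscScalar **vals, Mat *A)
{
  PetscErrorCode ierr;
  PetscInt       rstart, rend, i, k, row, *d_nnz, *o_nnz;

  PetscFunctionBegin;
  /* First row owned by this rank: prefix sum of the local sizes. */
  ierr   = MPI_Scan(&nlocal, &rend, 1, MPIU_INT, MPI_SUM, comm);CHKERRQ(ierr);
  rstart = rend - nlocal;

  /* Count, per row, how many entries fall in the diagonal block (columns owned
     by this rank) and how many fall in the off-diagonal block. */
  ierr = PetscMalloc2(nlocal, &d_nnz, nlocal, &o_nnz);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) {
    d_nnz[i] = 0; o_nnz[i] = 0;
    for (k = 0; k < ncols[i]; k++) {
      if (cols[i][k] >= rstart && cols[i][k] < rend) d_nnz[i]++;
      else                                           o_nnz[i]++;
    }
  }

  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, nlocal, nlocal, nglobal, nglobal);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(*A, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);
  ierr = PetscFree2(d_nnz, o_nnz);CHKERRQ(ierr);

  /* One MatSetValues() call per particle row, then assemble. */
  for (i = 0; i < nlocal; i++) {
    row  = rstart + i;
    ierr = MatSetValues(*A, 1, &row, ncols[i], cols[i], vals[i], INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

When particles migrate and the partitioning or sparsity changes, the caller destroys the old matrix with MatDestroy() and calls a routine like this again, as Dave suggests.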
>
> Regards,
>
> Massoud
>
> On 11/28/2016 06:36 PM, Barry Smith wrote:
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Nov 28, 2016, at 10:30 AM, Massoud Rezavand <<a href="mailto:Massoud.Rezavand@uibk.ac.at" target="_blank">Massoud.Rezavand@uibk.ac.at</a>> wrote:<br>
<br>
Dear Barry,<br>
<br>
You recommended me to directly use MatSetValues() and not to put the matrix in a parallel CSR matrix.<br>
<br>
In order to count the d_nz and o_nz I have to put the entries into a sequential CSR matrix<br>
</blockquote>
>>
>> If you don't know the number of nonzeros per row, how are you going to put the values into a sequential CSR format?
>>
>> On the other hand, if you can figure out the number of nonzeros per row without creating the matrix, how come you cannot figure out the d_nz and o_nz?
>>
>>> and then do the MatMPIAIJSetPreallocation() and then do the MatSetValues().
>>
>> If you do put the values into a sequential CSR format, which it is not clear to me is needed, then you can just call
>> MatCreateMPIAIJWithArrays() and skip the "MatMPIAIJSetPreallocation() and then do the MatSetValues()".
>>
>> Barry
>>
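For comparison, a minimal sketch of the MatCreateMPIAIJWithArrays() route Barry mentions, assuming the locally owned rows are already stored in 0-based CSR arrays (row_ptr, col_idx and values are hypothetical names for those arrays):

#include <petscmat.h>

/* Wrap an existing per-rank CSR structure in a PETSc matrix. row_ptr has
   nlocal+1 entries; col_idx/values hold the 0-based global column indices and
   coefficients of the locally owned rows. PETSc copies the data into its own
   preallocated storage, so no separate MatMPIAIJSetPreallocation()/
   MatSetValues() phase is needed. */
PetscErrorCode WrapParallelCSR(MPI_Comm comm, PetscInt nlocal,
                               const PetscInt *row_ptr, const PetscInt *col_idx,
                               const PetscScalar *values, Mat *A)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatCreateMPIAIJWithArrays(comm, nlocal, nlocal,
                                   PETSC_DETERMINE, PETSC_DETERMINE,
                                   row_ptr, col_idx, values, A);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

As with the preallocation route, the matrix would be destroyed with MatDestroy() and re-created whenever the sparsity or the partitioning changes.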
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Does it effect the performance ?<br>
<br>
<br>
Regards,<br>
<br>
Massoud<br>
<br>
On 11/21/2016 08:10 PM, Barry Smith wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Nov 21, 2016, at 12:20 PM, Massoud Rezavand <<a href="mailto:Massoud.Rezavand@uibk.ac.at" target="_blank">Massoud.Rezavand@uibk.ac.at</a>> wrote:<br>
<br>
Thank you very much.<br>
<br>
Yes I am developing the new 3D version in Parallel with the PPM (the new generation OpenFPM, not released yet) library which generates the particles and decomposes the domain.<br>
<br>
I don't have the parallel matrix generation yet. In the old version I had CSR format and a vector of knowns (b).<br>
So, should I use MatSetValuesStencil() ?<br>
</blockquote>
>>>>
>>>> MatSetValuesStencil() is for finite differences on a structured grid. I don't think it makes sense for your application.
>>>>
>>>> You need to use MatMPIAIJSetPreallocation() and then MatSetValues() to put the entries in.
>>>>
>>>>> What do you recommend for creating the vector of knowns (b)?
>>>>
>>>> Just use VecCreateMPI().
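A small sketch of that, assuming nlocal matches the local row count of the matrix and local_rhs is a hypothetical array holding the right-hand side already computed for the locally owned particles:

#include <petscvec.h>

/* Create the known vector b with the same row distribution as the matrix
   (nlocal entries per rank) and copy in the locally computed right-hand side. */
PetscErrorCode BuildRHS(MPI_Comm comm, PetscInt nlocal, const PetscScalar *local_rhs, Vec *b)
{
  PetscErrorCode ierr;
  PetscInt       i;
  PetscScalar    *barray;

  PetscFunctionBegin;
  ierr = VecCreateMPI(comm, nlocal, PETSC_DETERMINE, b);CHKERRQ(ierr);
  ierr = VecGetArray(*b, &barray);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) barray[i] = local_rhs[i];
  ierr = VecRestoreArray(*b, &barray);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}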
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On the other hand, due to the convergence issues for millions of particles in ISPH, I have to use a preconditioner. In a paper I saw they have used BoomerAMG from HYPRE. Do you have any recommendation?<br>
</blockquote>
We have many to try, it is not clear that any would be particularly good for SPH. Certainly try BoomerAMG<br>
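A minimal solver setup along those lines (assuming a PETSc build configured with hypre, e.g. --download-hypre; the same choice can also be made purely at run time with -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg):

#include <petscksp.h>

/* Solve A x = b with CG preconditioned by BoomerAMG through the PETSc/hypre
   interface. Requires a PETSc build configured with hypre. */
PetscErrorCode SolvePressure(Mat A, Vec b, Vec x)
{
  PetscErrorCode ierr;
  KSP            ksp;
  PC             pc;

  PetscFunctionBegin;
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCHYPRE);CHKERRQ(ierr);
  ierr = PCHYPRESetType(pc, "boomeramg");CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* allow -ksp_type/-pc_type overrides */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

KSPSetFromOptions() leaves room to experiment with the other preconditioners mentioned here from the command line (e.g. -pc_type gamg).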
>>>>
>>>>> I saw an example (ex19.c) using BoomerAMG. Should I follow that?
>>>>>
>>>>> PS: regarding the unbalanced sparsity in SPH, yes, in contrast to mesh-based methods the A matrix in ISPH changes over time, but the number of non-zeros is determined by the number of neighboring particles, which in most cases is constant.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Massoud
>>>>>
>>>>> On 11/21/2016 06:18 PM, Barry Smith wrote:
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Nov 21, 2016, at 10:33 AM, Massoud Rezavand <<a href="mailto:Massoud.Rezavand@uibk.ac.at" target="_blank">Massoud.Rezavand@uibk.ac.at</a>><br>
wrote:<br>
<br>
Dear all,<br>
<br>
I am going to use PETSc in an Incompressible SPH code to solve the pressure Poisson equation as a linear system of equations.<br>
<br>
In my old sequential code I used the PCG method or the BiCGSTAB with jacobi preconditioner.<br>
I used to store the coefficient matrix (A) in CSR (AIJ) format and solve it.<br>
<br>
My question is that how should I import the CSR metrix and the known vector (b) into the library to be solved? Is there an example to show how to import and external system of eqs. into PETSc?<br>
<br>
>>>>>>
>>>>>> For sequential code it is straightforward.
>>>>>>
>>>>>> If you already have the matrix in CSR format, you can call MatCreateSeqAIJWithArrays() to create the PETSc matrix without copying the data. You can use VecCreateSeqWithArray() to provide the vector, or you can use VecPlaceArray() to use the array of vector values you provide.
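A sketch of that sequential path, with row_ptr, col_idx, values and rhs as hypothetical names for the arrays the old code already has:

#include <petscmat.h>
#include <petscvec.h>

/* Wrap existing sequential CSR arrays and the known-vector array in PETSc
   objects without copying them. The arrays must stay alive for as long as
   the Mat/Vec are in use. */
PetscErrorCode WrapSequentialSystem(PetscInt n, PetscInt *row_ptr, PetscInt *col_idx,
                                    PetscScalar *values, PetscScalar *rhs,
                                    Mat *A, Vec *b)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatCreateSeqAIJWithArrays(PETSC_COMM_SELF, n, n, row_ptr, col_idx, values, A);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, n, rhs, b);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}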
>>>>>>
>>>>>>> In my case, the computational domain is decomposed by another library, so does it affect the performance of PETSc?
>>>>>>>
>>>>>>
>>>>>> I read this to mean you want the new code to be parallel (but the old one is sequential)?
>>>>>>
>>>>>> If you don't currently have matrix generation in parallel, I urge you strongly to use MatSetValues() directly to generate your matrix; do not first put the matrix entries into some kind of parallel CSR format. If you already have the matrix in "parallel" CSR format, you can use MatCreateMPIAIJWithArrays() to copy the matrix over to the PETSc format.
>>>>>>
>>>>>> It is my understanding that SPH can produce matrices with very unbalanced sparsity. It is important to take this into account if you wish to run in parallel, since if you end up with some processes having many more nonzeros than others you will get very poor performance.
>>>>>>
>>>>>> Barry
>>>>>>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
With the best regards,<br>
Massoud<br>
<br>
<br>
</blockquote></blockquote></blockquote></blockquote></blockquote></blockquote>
<br>
</div></div></blockquote></div><br></div></div>