[petsc-users] KSP ex2.c and Parallelization

Fri Feb 15 17:32:03 CST 2019

   Unfortunately converting a sequential code to MPI parallelism requires a great deal of restructuring of the code (often little of the original code can be retained) to get good performance. This is because the data has to be partitioned well in parallel to get good performance. src/ksp/ksp/examples/tutorials/ex2.c is not a good example because it doesn't make an effort to partition the data for good parallel performance.

   The PETSc DMDACreate2d() is designed to handle the partitioning of the data and generation of the matrix effectively in parallel. Take a look at, for example, src/ksp/ksp/examples/tutorials/ex46.c 

  Barry

> On Feb 15, 2019, at 5:03 PM, Maahi Talukder <maahi.buet at gmail.com> wrote:
> 
> 
> Thank you so much for your reply.
> 
>  I am trying to solve  the inverse of Poisson's equation to generate structured grid. The discretization is the 9-point finite difference.  The elements of my 'A' matrix are being calculated as it follows and being stored in Q matrix. I am planning to pass the values of Q() matrix to PETSc Mat M to solve my linear system. And I am trying to solve it in parallel, as my sequential code is generating results perfectly. So to implement in parallel, I was following the ex2.c . So what do you think about my approch? 
> 
> Do i = 2,ymax-1
> 
> Do j = 2,xmax-1
> 
> g11(i-1,j-1) = 0.25*(x(i,j+1)-x(i,j-1))*(x(i,j+1)-x(i,j-1)) +0.25*(y(i,j+1)-y(i,j-1))*(y(i,j+1)-y(i,j-1))
> g12(i-1,j-1) = 0.25*(x(i,j+1)-x(i,j-1))*(x(i+1,j)-x(i-1,j))+ 0.25*(y(i,j+1)-y(i,j-1))*(y(i+1,j)-y(i-1,j))
> g22(i-1,j-1) = 0.25*(x(i+1,j)-x(i-1,j))*(x(i+1,j)-x(i-1,j)) + 0.25*(y(i+1,j)-y(i-1,j))*(y(i+1,j)-y(i-1,j))
> 
> M(j-1,j-1+xmax*(i-2))= (-1)*(g12(i-1,j-1)/2)
> M(j-1,j+xmax*(i-2))= g11(i-1,j-1)
> M(j-1,j+1+xmax*(i-2)) = g12(i-1,j-1)/2
> M(j-1,j-1+xmax*(i-2)+xmax)= g22(i-1,j-1)
> M(j-1,j+xmax*(i-2)+xmax)= (-2)*(g22(i-1,j-1)+g11(i-1,j-1))
> M(j-1,j+1+xmax*(i-2)+xmax)= g22(i-1,j-1)
> M(j-1,j-1+xmax*(i-2)+2*xmax) = g12(i-1,j-1)/2
> M(j-1,j+xmax*(i-2)+2*xmax) = g11(i-1,j-1)
> M(j-1,j+1+xmax*(i-2)+2*xmax) = (-1)*(g12(i-1,j-1)/2)       
> 
> end Do
> 
> N(1+(xmax-2)*(i-2):(xmax-2)*(i-1),1:xmax*ymax)= M(1:xmax-2,1:xmax*ymax)
> M = 0
> 
> end Do
> 
> E(1:(xmax-2)*(ymax-2),1:xmax*ymax-2*xmax) = N(1:(xmax-2)*(ymax-2),xmax+1:xmax*ymax-xmax)
> 
> Do i = 1,ymax-2
>     
> Q(1:(xmax-2)*(ymax-2),1+(xmax-2)*(i-1):(xmax-2)*i)= E(1:(xmax-2)*(ymax-2),2+xmax*(i-1):(xmax-1)+xmax*(i-1))
>     
> end Do
> 
> Thanks,
> Maahi Talukder
> 
> On Fri, Feb 15, 2019 at 5:47 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> 
>    Maybe we could provide more guidance on how to proceed if we knew a bit more about the 1) problem you are trying to solve, 2) the discretization you are using and 3) the mesh you are using?
> 
>     From your code fragment with a m*n I am guessing you are solving something on a structured grid with m grid points in the x direction and n grid points in the y direction? 
> 
>   Barry
> 
> 
> > On Feb 15, 2019, at 4:26 PM, Maahi Talukder via petsc-users <petsc-users at mcs.anl.gov> wrote:
> > 
> > Thank you so much for your reply. 
> >  
> > But in the definition of d_nz of MatMPIAIJSetPreallocation  , it is defined as 'number of nonzeros per row in DIAGONAL portion of local submatrix'. So I was wondering about how to find the Diagonal portion of the local submatrix that PETSc is splitting the big matrix into . So should it be the maximum number of nonzeros in each row of the big matrix instead of  the min number of nonzeros in the diagonal portion of the local matrix?
> > 
> > And so MatGetOwnershipRange returns the start and the end of all the smaller matrices that PETSc is splitting the big matrix into, right?
> > 
> > 
> > Regards,
> > Maahi Talukder
> > 
> > On Fri, Feb 15, 2019 at 2:43 PM Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> > Maahi :
> > Hello All,
> > 
> > I have some questions regarding ex2.c in /petsc/src/ksp/ksp/examples/tutorials/ex2.c. 
> > 
> > 1.   The  local sizes of the matrix was decided by PETSc using ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,m*n,m*n);CHKERRQ(ierr);
> > But without knowing the local sizes, how was the following function used which requires the knowledge of how the big matrix was split into smaller chunks within PETSc? 
> >   ierr = MatMPIAIJSetPreallocation(A,5,NULL,5,NULL);CHKERRQ(ierr);
> > This example sets a matrix for 2D 5-point PDE stencil, for which the maximum non-zeros in each row is 5. 
> > 
> > 2.  When using  MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr),  the global index of  first and last row of which local chunk of the big matrix is it returning ? I mean if the big matrix was split into three processes, for example, fist and last global row of which process is it returning? 
> > Adding a line 
> >   ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
> > +  printf("Istart %d Iend %d\n",Istart,Iend); 
> > 
> > I get
> > $ mpiexec -n 3 ./ex2
> > Istart 0 Iend 19
> > Istart 19 Iend 38
> > Istart 38 Iend 56
> > Norm of error 0.000522061 iterations 9
> > The matrix global size = 56
> > 
> > Hong
> > 
> > Your replies are highly appreciated
> > 
> > Thanks,
> > Maahi Talukder
> > Clarkson University 
> > 
> > 
>