[petsc-users] Assembling a finite element matrix in parallel

Barry Smith bsmith at mcs.anl.gov
Sun Apr 27 21:44:12 CDT 2014


   You can also just run with -pc_type redundant -redundant_pc_type lu.  It will use sequential LU on the entire matrix to solve the linear systems, but the rest of the code will be parallel.
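
   For example, assuming the executable is called main as elsewhere in this thread, the run line would look roughly like

      mpiexec -n 4 ./main -pc_type redundant -redundant_pc_type lu

   so the linear solves use PETSc's own sequential LU (no extra external package needed) while the rest of the application stays parallel.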

  Barry



On Apr 27, 2014, at 7:48 PM, Matthew Knepley <knepley at gmail.com> wrote:

> On Sun, Apr 27, 2014 at 7:22 PM, Justin Dong <jsd1 at rice.edu> wrote:
> Thanks for your help -- the matrix assembly is working correctly now.
> 
> However, say I want to solve the linear problem in serial now. I have A_global and b_global created using MatCreateAIJ and VecCreateMPI, but want to solve A_global*x = b_global using PCLU. I have:
> 
> KSPCreate(PETSC_COMM_SELF, &ksp);
> KSPSetOperators(ksp, A_global, A_global, DIFFERENT_NONZERO_PATTERN);
> KSPGetPC(ksp, &pc);
> 
> PCSetType(pc, PCLU);
> 
> KSPSetTolerances(ksp, 1.e-13, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);
> 
> KSPSolve(ksp, b_global, b_global);
> 
> 
> Is there any way to get this to work? I'd prefer to use PCLU right now, and besides, the speed-up from assembling in parallel is already significant.
> 
> I recommend solving in parallel (on the same communicator as the matrix). This is not too hard:
> 
>   1) Reconfigure using $PETSC_DIR/$PETSC_ARCH/conf/reconfigure-$PETSC_ARCH.py --download-superlu_dist
> 
>   2) Call KSPSetFromOptions(ksp) before solve
> 
>   3) Run with -pc_type lu -pc_factor_mat_solver_package superlu_dist
> 
> You can easily get your solution on one process if you want using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreateToZero.html
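> 
> A rough sketch of steps 2) and 3) plus that scatter, assuming the PETSc 3.4-style KSPSetOperators() used above and a solution vector x duplicated from b_global (untested):
> 
>   Vec        x, xseq;
>   VecScatter scatter;
>   KSP        ksp;
> 
>   VecDuplicate(b_global, &x);
>   KSPCreate(PETSC_COMM_WORLD, &ksp);              /* same communicator as A_global */
>   KSPSetOperators(ksp, A_global, A_global, DIFFERENT_NONZERO_PATTERN);
>   KSPSetFromOptions(ksp);                         /* picks up -pc_type lu -pc_factor_mat_solver_package superlu_dist */
>   KSPSolve(ksp, b_global, x);
> 
>   VecScatterCreateToZero(x, &scatter, &xseq);     /* xseq holds the full solution on rank 0 only */
>   VecScatterBegin(scatter, x, xseq, INSERT_VALUES, SCATTER_FORWARD);
>   VecScatterEnd(scatter, x, xseq, INSERT_VALUES, SCATTER_FORWARD);
>   VecScatterDestroy(&scatter);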
> 
>   Thanks,
> 
>      Matt
>  
> 
> On Sun, Apr 27, 2014 at 9:11 AM, Matthew Knepley <knepley at gmail.com> wrote:
> On Sun, Apr 27, 2014 at 9:05 AM, Justin Dong <jsd1 at rice.edu> wrote:
> When I try to invoke MatCreateMPIAIJ, I'm getting this warning:
> 
> main.c:1842:2: warning: implicit declaration of function 'MatCreateMPIAIJ' is invalid in C99 [-Wimplicit-function-declaration]
> 
>         MatCreateMPIAIJ(PETSC_COMM_SELF, NLoc, NLoc, NLoc*nElems, NLoc*nElems, (ref+3)*NLoc, 
> 
>         ^
> 
> and this error:
> 
> The name was changed in the last release:
> 
>   http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateAIJ.html
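> 
> With the new name, the call shown further down would read roughly as follows (a sketch, untested):
> 
>   MatCreateAIJ(PETSC_COMM_WORLD, nlocal, nlocal, nlocal*nelements, nlocal*nelements,
>                3*nlocal, PETSC_NULL, 3*nlocal, PETSC_NULL, &A_global);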
> 
>   Thanks,
> 
>       Matt
>  
> Undefined symbols for architecture x86_64:
> 
>   "_MatCreateMPIAIJ", referenced from:
> 
>       _DG_SolveOil in main.o
> 
> ld: symbol(s) not found for architecture x86_64
> 
> clang: error: linker command failed with exit code 1 (use -v to see invocation)
> 
> make: [main] Error 1 (ignored)
> 
> 
> /bin/rm -f -f main.o
> 
> 
> 
> This is how I'm calling MatCreateMPIAIJ:
> 
> MatCreateMPIAIJ(PETSC_COMM_WORLD, nlocal, nlocal, nlocal*nelements, nlocal*nelements, 3*nlocal,
>                 PETSC_NULL, 3*nlocal, PETSC_NULL, &A_global);
> 
> 
> 
> I'm able to run some simple MPI functions and use mpiexec, so I don't think it's an issue with MPI itself.
> 
> Sincerely,
> Justin
> 
> 
> 
> On Sun, Apr 27, 2014 at 8:35 AM, Matthew Knepley <knepley at gmail.com> wrote:
> On Sun, Apr 27, 2014 at 7:47 AM, Justin Dong <jsd1 at rice.edu> wrote:
> I think I've got the right idea on how to divide up the for loop:
> 
>     MPI_Comm_size(MPI_COMM_WORLD,&numprocs); 
>     MPI_Comm_rank(MPI_COMM_WORLD,&myid); 
> 
>     mystart = (nelements / numprocs) * myid;
>     if (nelements % numprocs > myid)
>     {
>     	mystart += myid;
>     	myend = mystart + (nelements / numprocs) + 1;
>     }
>     else
>     {
>     	mystart += nelements % numprocs;
>     	myend = mystart + (nelements / numprocs);
>     }
> 
> We have a function that does exactly this: http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Sys/PetscSplitOwnership.html
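> 
> A minimal sketch of using it here, assuming mystart/myend are PetscInt (the MPI_Scan to recover the starting element is my addition; untested):
> 
>   PetscInt nlocalElems = PETSC_DECIDE, nglobalElems = nelements;
> 
>   PetscSplitOwnership(PETSC_COMM_WORLD, &nlocalElems, &nglobalElems);  /* fills in nlocalElems */
>   MPI_Scan(&nlocalElems, &myend, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);
>   mystart = myend - nlocalElems;   /* this process owns elements [mystart, myend) */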
>  
> But then, do I do this?
> 
> Yes.
>  
> for (k = mystart; k < myend; ++k)
> {
>     get_index(k,ie,je); /* ie and je are arrays that indicate where to place A_local */
>     compute_local_matrix(A_local,...);
> 
>     MatSetValues(A_global, nlocal, ie, nlocal, je, A_local, ADD_VALUES);
> }
> 
> The indices are global indices and I'm doing 
> 
> MatCreateSeqAIJ(PETSC_COMM_WORLD, nglobal, nglobal, 3*nlocal, PETSC_NULL, &A_global);
> 
> This should be MatCreateMPIAIJ() for parallel execution.
> 
>    Matt
> 
> to create the matrix. Running the program seems to give me multiple errors, but mainly
> [3]PETSC ERROR: Object is in wrong state!
> 
> [3]PETSC ERROR: Mat object's type is not set: Argument # 1!
> 
> 
> 
> 
> 
> 
> On Sun, Apr 27, 2014 at 6:54 AM, Matthew Knepley <knepley at gmail.com> wrote:
> On Sun, Apr 27, 2014 at 6:25 AM, Justin Dong <jsd1 at rice.edu> wrote:
> Hi Matt,
> 
> For option 1), that would be using MPI and not any special functions in PETSc? I ask since I've never done parallel programming before. I tried using OpenMP to parallelize that for loop, but it resulted in some errors that I didn't investigate further; I'm assuming it's because each thread wasn't coordinating properly with the others during MatSetValues?
> 
> Yes, using MPI, so each process owns a set of elements that it loops over. The Mat object manages the global
> values as long as you use global indices for the (row, column). There are of course many refinements for this,
> but I think the thing to do is get something working fast, and then optimize it.
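> 
> A rough sketch of what that split loop might look like, assuming this process owns elements [mystart, myend); the MatAssemblyBegin/End calls at the end are my addition and are required before the matrix can be used (untested):
> 
>   for (k = mystart; k < myend; ++k)
>   {
>       get_index(k, ie, je);                /* global row/column indices for element k */
>       compute_local_matrix(A_local, ...);
> 
>       MatSetValues(A_global, nlocal, ie, nlocal, je, A_local, ADD_VALUES);
>   }
>   MatAssemblyBegin(A_global, MAT_FINAL_ASSEMBLY);   /* ships off-process values to their owners */
>   MatAssemblyEnd(A_global, MAT_FINAL_ASSEMBLY);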
> 
>   Thanks,
> 
>      Matt
>  
> Sincerely,
> Justin
> 
> 
> On Sun, Apr 27, 2014 at 5:57 AM, Matthew Knepley <knepley at gmail.com> wrote:
> On Sun, Apr 27, 2014 at 4:14 AM, Justin Dong <jsd1 at rice.edu> wrote:
> Hi all,
> 
> I'm currently coding a finite element problem in PETSc. I'm computing all of the matrices by myself and would prefer to do it that way. I want to parallelize the assembly of the global matrices. This is a simplified version of the assembly routine (pseudocode):
> 
> for (k = 0; k < nelements; ++k)
> {
>     get_index(k,ie,je); /* ie and je are arrays that indicate where to place A_local */
>     compute_local_matrix(A_local,...);
> 
>     MatSetValues(A_global, nlocal, ie, nlocal, je, A_local, ADD_VALUES);
> }
> 
> This is for DG finite elements, and I know the matrix can easily be assembled in parallel. Even on my laptop, this would allow me to speed up my code significantly. The only problem is, I'm not sure how to do this in PETSc. I'd assume I need to use MatCreateMPIAIJ instead of MatCreateSeqAIJ, but I wasn't able to find any tutorials on how I might do this.
> 
> 1) If you just split this loop across processes, it would work immediately. However, that can be non-optimal in terms
>      of communication.
> 
> 2) PETSc matrices are distributed by contiguous blocks of rows (see manual section on matrices), so you would like
>     those row blocks to correspond roughly to your element blocks.
> 
> 3) You will also have to preallocate, but that should be the same as you do now for the sequential case, except you
>      check whether a column is inside the diagonal block.
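> 
> For 3), a sketch of the kind of counting loop I mean, assuming this process owns global rows [rstart, rend) and that ncols[]/cols[][] and the d_nnz[]/o_nnz[] arrays are hypothetical placeholders describing the local sparsity (untested):
> 
>   for (row = rstart; row < rend; ++row) {
>     d_nnz[row - rstart] = 0;
>     o_nnz[row - rstart] = 0;
>     for (j = 0; j < ncols[row]; ++j) {                         /* columns coupled to this row */
>       col = cols[row][j];
>       if (col >= rstart && col < rend) ++d_nnz[row - rstart];  /* inside the diagonal block  */
>       else                             ++o_nnz[row - rstart];  /* outside the diagonal block */
>     }
>   }
>   MatMPIAIJSetPreallocation(A_global, 0, d_nnz, 0, o_nnz);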
> 
>   Thanks,
> 
>       Matt
>  
> Sincerely,
> Justin
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener


