[petsc-users] PETSC matrix assembling super slow

Mark Adams mfadams at lbl.gov
Tue Jan 29 21:02:45 CST 2019


Optimized is a configuration flag not a versions.

You need to figure out your number of non-zeros per row of you global
matrix, or a bound on it, and supply that in MatMPIAIJSetPreallocation.
Otherwise it has to allocate and copy memory often.

You could increase your f9 on a serial run and see what runs best and then
move to parallel with a value in f6 of about 1/2 of f9.

On Tue, Jan 29, 2019 at 9:13 PM Yaxiong Chen <chen2018 at purdue.edu> wrote:

> Thanks Mark,
>
>     I use PETSC 3.9.4, is this the optimized version you called?
>
>     Actually f9 and f6 are from the PETSC example. I don't know how to set
> the value correctly so I commend them. The size of my elemental matrix may
> vary. For 2D problem, the size of elemental matrix can be 24*24 or 32*32 or
> some other sizes. And the index is not continuous. In this case, the
> elemental matrix may interlace with each other in the global matrix, and I
> may have thousands of elemental matrix to be assembled. Does the
> preallocating suitable for this?
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
>
> ------------------------------
> *From:* Mark Adams <mfadams at lbl.gov>
> *Sent:* Tuesday, January 29, 2019 8:25 PM
> *To:* Yaxiong Chen
> *Cc:* Song Gao; petsc-users at mcs.anl.gov
> *Subject:* Re: [petsc-users] PETSC matrix assembling super slow
>
> Slow assembly is often from not preallocating correctly. I am guessing
> that you are using Q1 element and f9==9, and thus the preallocation should
> be OK if this is a scalar problem on a regular grid and f6-==6 should be OK
> for the off processor allocation, if my assumptions are correct.
>
> You can run with -info, which will tell you how many allocation were done
> in assembly. Make sure that it is small (eg, 0).
>
> I see you use f90 array stuff 'idx-1'. Compilers can sometimes do crazy
> things with seeming simple code. You could just do this manually if you can
> find anything else.
>
> And I trust you are running an optimized version of PETSc.
>
>
> On Tue, Jan 29, 2019 at 6:22 PM Yaxiong Chen via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hi Song,
>
> I don't quite understand how I can use this command. I don't partition the
> gloabl matrix. If I add my elemental matrix to the global system it will be
> like this. And in my parallel part, I use each core to generate the
> elemental matrix in turn. In this case, I guess each core will be assigned
> the space for global matrix and finally be assembled.But according to the
> manual, it seems each core will store a part of the global matrix. Is the local
> submatrix in the MatMPIAIJSetPreallocation
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation>
> (Mat
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/Mat.html#Mat>
> B,PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> d_nz,const PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> d_nnz[],PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> o_nz,const PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> o_nnz[])the same as my elemental matrix?
>
>
>
> Thanks
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
>
> ------------------------------
> *From:* Song Gao <song.gao2 at mail.mcgill.ca>
> *Sent:* Tuesday, January 29, 2019 1:22 PM
> *To:* Yaxiong Chen
> *Cc:* Matthew Knepley; petsc-users at mcs.anl.gov
> *Subject:* Re: [petsc-users] PETSC matrix assembling super slow
>
> I think you would prefer to preallocate the matrix
>
> uncomment this line
> ! call
> MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER,
> ierr)
>
>
>
> Le mar. 29 janv. 2019, à 12 h 40, Yaxiong Chen via petsc-users <
> petsc-users at mcs.anl.gov> a écrit :
>
> Hello,
>
>
> I have a 2D system which is assembled by each elemental matrix.  Ae is my
> elemental matrix, auxRHSe(:) and RHSe(:) and corresponding right hand side,
> idx is the global index. My code is as follow, however ,the assembling rate
> is super slow(Marked red in the code). I am not sure whether the assembling
> type is right or not. Since for each element, idx are not continuous
> numbers. Do you have any idea what is the better way to assemble the matrix?
>
>
> Thanks
>
>
> block
>     PetscErrorCode ierr
>      PetscMPIInt rank,nproc, mystart
>      PetscInt  nelem
>      integer,allocatable  ::info(:)
>      real(wp), allocatable :: Ae(:,:), auxRHSe(:),RHSe(:)
>      integer, allocatable      :: idx(:)
>      PetscScalar, pointer :: xx_v(:)
>
>      PC               prec
>      PetscScalar ::   val
>      Vec              xvec,bvec,uvec
>      Mat              Amat
>      KSP              ksp
>      PetscViewer viewer
>      PetscInt  geq,j,k,ne,M,Istart,Iend
>      PetscBool  flg
>      KSPConvergedReason reason
>      Vec         dummyVec, dummyVecs(1)
>      MatNullSpace nullspace
>
>      call PetscInitialize( PETSC_NULL_CHARACTER, ierr )
>
>      if (ierr .ne. 0) then
>      print*,'Unable to initialize PETSc'
>      stop
>      endif
>      call MPI_Comm_size(PETSC_COMM_WORLD,nproc,ierr)
>      call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
>      mystart=rank+1
>
>  !   Parameter set
>      info=ptsystem%getInitInfo()
>      nelem =
> info(Info_EleMatNum)+info(Info_FixedDOFNum)+info(Info_NumConstrain)
>      print*,'nelem',nelem
>  !-------------------------------------
>  !  Create Matrix
>      call MatCreate(PETSC_COMM_WORLD,Amat,ierr)
>      call MatSetSizes( Amat,PETSC_DECIDE, PETSC_DECIDE, info(1), info(1),
> ierr )
>      call MatSetType( Amat, MATMPIBAIJ, ierr )
>      ! call
> MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER,
> ierr)
>      call MatSetFromOptions( Amat, ierr )
>      call MatSetUp( Amat, ierr )
>      call MatGetOwnershipRange( Amat, Istart, Iend, ierr )
>
>       xvec = tVec(0)
>       call MatCreateVecs( Amat, PETSC_NULL_VEC, xvec, ierr )
>       call VecSetFromOptions( xvec, ierr )
>       call VecDuplicate( xvec, bvec, ierr )
>       call VecDuplicate( xvec, uvec, ierr )
>
>    t1 = MPI_WTime();
>
>       do i=mystart,nelem,nproc
>             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
>             ne=size(idx)
>             if (allocated(auxRHSe))  call
> VecSetValues(bvec,ne,idx-1,auxRHSe,ADD_VALUES,ierr)
>              call MatSetValues(Amat,ne,idx-1,ne,idx-1,Ae,ADD_VALUES,ierr)
>       end do
>
>       nelem   = info(Info_EleRHSNum)
>       mystart = rank+1
>
>       do i = mystart, nelem, nproc
>         call ptSystem%getElementalRHS(i, RHSe, idx)
>         print*,'idx',idx
>         ne=size(idx)
>         if (allocated(RHSe))  call
> VecSetValues(bvec,ne,idx-1,RHSe,ADD_VALUES,ierr)
>       end do
>       call MatAssemblyBegin(Amat,MAT_FINAL_ASSEMBLY,ierr)
>       call MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)
>       !  this part is slow, the for loop above is done but here it may get
> stuck
>       call VecAssemblyBegin(bvec,ierr)
>                                 ! For a 2500 DOF system, assembling only
> takes over 2 seconds
>       call VecAssemblyEnd(bvec,ierr)
>                                 ! But for a 10000 DOF system , it gets stuck
>      t2 = MPI_WTime();
>       print*,'assembeling time',t2-t1
>       ! Solve
>       call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
> !  Set operators. Here the matrix that defines the linear system
> !  also serves as the preconditioning matrix.
>       call KSPSetOperators(ksp,Amat,Amat,ierr)
>       call KSPSetFromOptions(ksp,ierr)
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> !                      Solve the linear system
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>     call KSPSetType(ksp,KSPCG,ierr)
>     call KSPGetPC(ksp,prec,ierr)
>    ! call KSPSetPCSide(ksp,PC_SYMMETRIC,ierr)
>     call PCSetType(prec,PCJACOBI,ierr)
>      call KSPSolve(ksp,bvec,xvec,ierr)
>      call PetscFinalize(ierr)
>
> end block
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190129/077f2f4b/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 3840 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190129/077f2f4b/attachment-0001.png>


More information about the petsc-users mailing list