[petsc-users] PETSC matrix assembling super slow
Mark Adams
mfadams at lbl.gov
Tue Feb 5 14:22:03 CST 2019
On Mon, Feb 4, 2019 at 4:17 PM Yaxiong Chen <chen2018 at purdue.edu> wrote:
> Hi Mark,
>
> Will the parameter MatMPIAIJSetPreallocation in influence the
> following part
> do i=mystart,nelem,nproc
> call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> ne=size(idx)
> idx=idx-1 !-1 since PETSC index starts from zero
> if (allocated(auxRHSe)) call
> VecSetValues(bvec,ne,idx,auxRHSe,ADD_VALUES,ierr)
> call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)
> end do
>
> I found this part will impede my assembling process. In my case ,the total
> DOF is 20800. I estimated the upper bound of number of nonzero entries in
> each row as 594.
>
Then you want to set f9 to 594
> So I set f9 to be 20206 and f6 to be 10103 in the following command:
>
call
> MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER,
> ierr)
>
>
> Running the code in sequential mode with -info I got the following
> information.And I can get the result even thought the assembling process is
> kind of slow(several times slower than using Mumps).
>
> [0] MatAssemblyBegin_MPIBAIJ(): Stash has 0 entries,uses 0 mallocs.
>
> [0] MatAssemblyBegin_MPIBAIJ(): Block-Stash has 0 entries, uses 0 mallocs.
>
> [0] MatAssemblyEnd_SeqBAIJ(): Matrix size: 20800 X 20800, block size 1;
> storage space: 13847 unneeded, 1030038 used
>
> [0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is
> 62659
>
> [0] MatAssemblyEnd_SeqBAIJ(): Most nonzeros blocks in any row is 70
>
> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 20800) < 0.6. Do not use CompressedRow routines.
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4462889472
> 140509341618528
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4462889472
> 140509341618528
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4462889472
> 140509341618528
>
> [0] VecScatterCreate_Seq(): Special case: sequential copy
>
> [0] MatAssemblyEnd_SeqBAIJ(): Matrix size: 20800 X 0, block size 1;
> storage space: 104000 unneeded, 0 used
>
This looks like your matrix was create with 0 columns.
> [0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 0
>
> [0] MatAssemblyEnd_SeqBAIJ(): Most nonzeros blocks in any row is 0
>
> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 20800)/(num_localrows 20800) > 0.6. Use CompressedRow routines.
>
> aseembel time 282.54871800000001
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4462888960
> 140509341442256
>
> [0] PCSetUp(): Setting up PC for first time
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> [0] PCSetUp(): Leaving PC with identical preconditioner since operator is
> unchanged
>
> However, when I use the parallel mode,I got the following information:
>
> [0] MatAssemblyBegin_MPIBAIJ(): Stash has 707965 entries,uses 6 mallocs.
>
> [0] MatAssemblyBegin_MPIBAIJ(): Block-Stash has 0 entries, uses 0 mallocs.
>
> it seems it never went to call
> MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)
>
> Is there anything I am doing wrong?
>
> Thanks
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
>
> ------------------------------
> *From:* Mark Adams <mfadams at lbl.gov>
> *Sent:* Tuesday, January 29, 2019 10:02 PM
> *To:* Yaxiong Chen; PETSc users list
> *Subject:* Re: [petsc-users] PETSC matrix assembling super slow
>
> Optimized is a configuration flag not a versions.
>
> You need to figure out your number of non-zeros per row of you global
> matrix, or a bound on it, and supply that in MatMPIAIJSetPreallocation.
> Otherwise it has to allocate and copy memory often.
>
> You could increase your f9 on a serial run and see what runs best and
> then move to parallel with a value in f6 of about 1/2 of f9.
>
> On Tue, Jan 29, 2019 at 9:13 PM Yaxiong Chen <chen2018 at purdue.edu> wrote:
>
> Thanks Mark,
>
> I use PETSC 3.9.4, is this the optimized version you called?
>
> Actually f9 and f6 are from the PETSC example. I don't know how to set
> the value correctly so I commend them. The size of my elemental matrix may
> vary. For 2D problem, the size of elemental matrix can be 24*24 or 32*32 or
> some other sizes. And the index is not continuous. In this case, the
> elemental matrix may interlace with each other in the global matrix, and I
> may have thousands of elemental matrix to be assembled. Does the
> preallocating suitable for this?
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
>
> ------------------------------
> *From:* Mark Adams <mfadams at lbl.gov>
> *Sent:* Tuesday, January 29, 2019 8:25 PM
> *To:* Yaxiong Chen
> *Cc:* Song Gao; petsc-users at mcs.anl.gov
> *Subject:* Re: [petsc-users] PETSC matrix assembling super slow
>
> Slow assembly is often from not preallocating correctly. I am guessing
> that you are using Q1 element and f9==9, and thus the preallocation should
> be OK if this is a scalar problem on a regular grid and f6-==6 should be OK
> for the off processor allocation, if my assumptions are correct.
>
> You can run with -info, which will tell you how many allocation were done
> in assembly. Make sure that it is small (eg, 0).
>
> I see you use f90 array stuff 'idx-1'. Compilers can sometimes do crazy
> things with seeming simple code. You could just do this manually if you can
> find anything else.
>
> And I trust you are running an optimized version of PETSc.
>
>
> On Tue, Jan 29, 2019 at 6:22 PM Yaxiong Chen via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hi Song,
>
> I don't quite understand how I can use this command. I don't partition the
> gloabl matrix. If I add my elemental matrix to the global system it will be
> like this. And in my parallel part, I use each core to generate the
> elemental matrix in turn. In this case, I guess each core will be assigned
> the space for global matrix and finally be assembled.But according to the
> manual, it seems each core will store a part of the global matrix. Is the local
> submatrix in the MatMPIAIJSetPreallocation
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation>
> (Mat
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/Mat.html#Mat>
> B,PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> d_nz,const PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> d_nnz[],PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> o_nz,const PetscInt
> <https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInt.html#PetscInt>
> o_nnz[])the same as my elemental matrix?
>
>
>
> Thanks
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
>
> ------------------------------
> *From:* Song Gao <song.gao2 at mail.mcgill.ca>
> *Sent:* Tuesday, January 29, 2019 1:22 PM
> *To:* Yaxiong Chen
> *Cc:* Matthew Knepley; petsc-users at mcs.anl.gov
> *Subject:* Re: [petsc-users] PETSC matrix assembling super slow
>
> I think you would prefer to preallocate the matrix
>
> uncomment this line
> ! call
> MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER,
> ierr)
>
>
>
> Le mar. 29 janv. 2019, à 12 h 40, Yaxiong Chen via petsc-users <
> petsc-users at mcs.anl.gov> a écrit :
>
> Hello,
>
>
> I have a 2D system which is assembled by each elemental matrix. Ae is my
> elemental matrix, auxRHSe(:) and RHSe(:) and corresponding right hand side,
> idx is the global index. My code is as follow, however ,the assembling rate
> is super slow(Marked red in the code). I am not sure whether the assembling
> type is right or not. Since for each element, idx are not continuous
> numbers. Do you have any idea what is the better way to assemble the matrix?
>
>
> Thanks
>
>
> block
> PetscErrorCode ierr
> PetscMPIInt rank,nproc, mystart
> PetscInt nelem
> integer,allocatable ::info(:)
> real(wp), allocatable :: Ae(:,:), auxRHSe(:),RHSe(:)
> integer, allocatable :: idx(:)
> PetscScalar, pointer :: xx_v(:)
>
> PC prec
> PetscScalar :: val
> Vec xvec,bvec,uvec
> Mat Amat
> KSP ksp
> PetscViewer viewer
> PetscInt geq,j,k,ne,M,Istart,Iend
> PetscBool flg
> KSPConvergedReason reason
> Vec dummyVec, dummyVecs(1)
> MatNullSpace nullspace
>
> call PetscInitialize( PETSC_NULL_CHARACTER, ierr )
>
> if (ierr .ne. 0) then
> print*,'Unable to initialize PETSc'
> stop
> endif
> call MPI_Comm_size(PETSC_COMM_WORLD,nproc,ierr)
> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
> mystart=rank+1
>
> ! Parameter set
> info=ptsystem%getInitInfo()
> nelem =
> info(Info_EleMatNum)+info(Info_FixedDOFNum)+info(Info_NumConstrain)
> print*,'nelem',nelem
> !-------------------------------------
> ! Create Matrix
> call MatCreate(PETSC_COMM_WORLD,Amat,ierr)
> call MatSetSizes( Amat,PETSC_DECIDE, PETSC_DECIDE, info(1), info(1),
> ierr )
> call MatSetType( Amat, MATMPIBAIJ, ierr )
> ! call
> MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER,
> ierr)
> call MatSetFromOptions( Amat, ierr )
> call MatSetUp( Amat, ierr )
> call MatGetOwnershipRange( Amat, Istart, Iend, ierr )
>
> xvec = tVec(0)
> call MatCreateVecs( Amat, PETSC_NULL_VEC, xvec, ierr )
> call VecSetFromOptions( xvec, ierr )
> call VecDuplicate( xvec, bvec, ierr )
> call VecDuplicate( xvec, uvec, ierr )
>
> t1 = MPI_WTime();
>
> do i=mystart,nelem,nproc
> call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> ne=size(idx)
> if (allocated(auxRHSe)) call
> VecSetValues(bvec,ne,idx-1,auxRHSe,ADD_VALUES,ierr)
> call MatSetValues(Amat,ne,idx-1,ne,idx-1,Ae,ADD_VALUES,ierr)
> end do
>
> nelem = info(Info_EleRHSNum)
> mystart = rank+1
>
> do i = mystart, nelem, nproc
> call ptSystem%getElementalRHS(i, RHSe, idx)
> print*,'idx',idx
> ne=size(idx)
> if (allocated(RHSe)) call
> VecSetValues(bvec,ne,idx-1,RHSe,ADD_VALUES,ierr)
> end do
> call MatAssemblyBegin(Amat,MAT_FINAL_ASSEMBLY,ierr)
> call MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)
> ! this part is slow, the for loop above is done but here it may get
> stuck
> call VecAssemblyBegin(bvec,ierr)
> ! For a 2500 DOF system, assembling only
> takes over 2 seconds
> call VecAssemblyEnd(bvec,ierr)
> ! But for a 10000 DOF system , it gets stuck
> t2 = MPI_WTime();
> print*,'assembeling time',t2-t1
> ! Solve
> call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
> ! Set operators. Here the matrix that defines the linear system
> ! also serves as the preconditioning matrix.
> call KSPSetOperators(ksp,Amat,Amat,ierr)
> call KSPSetFromOptions(ksp,ierr)
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> ! Solve the linear system
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> call KSPSetType(ksp,KSPCG,ierr)
> call KSPGetPC(ksp,prec,ierr)
> ! call KSPSetPCSide(ksp,PC_SYMMETRIC,ierr)
> call PCSetType(prec,PCJACOBI,ierr)
> call KSPSolve(ksp,bvec,xvec,ierr)
> call PetscFinalize(ierr)
>
> end block
>
>
> Yaxiong Chen,
> Ph.D. Student
>
> School of Mechanical Engineering, 3171
>
> 585 Purdue Mall
>
> West Lafayette, IN 47907
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190205/52de4ff3/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 3840 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190205/52de4ff3/attachment-0001.png>
More information about the petsc-users
mailing list