[petsc-users] Why does my PETSc program get stuck and hang when assembling the matrix?
Fangbo Wang
fangbowa at buffalo.edu
Sun Feb 26 22:37:48 CST 2017
I got my finite element mesh from the commercial finite element software
ABAQUS. I simply draw the geometry of the model in its graphical interface
and assign element types and material properties to the different parts of
the model; ABAQUS then automatically outputs the element and node
information of the model.
Suppose I have 1000 elements in my model and 10 MPI processes:
#1 to #100 local element matrices will be computed in MPI process 0;
#101 to #200 local element matrices will be computed in MPI process 1;
#201 to #300 local element matrices will be computed in MPI process 2;
..........
#901 to #1000 local element matrices will be computed in MPI process 9;
However, I might generate a lot of matrix entries whose global indices
belong to other processes, due to the degree-of-freedom ordering in the
finite element model.
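In code, the pattern looks roughly like this (a minimal sketch: NDOF,
getElementDofs(), and computeElementMatrix() are hypothetical placeholders
for my FEM routines; the MatSetValues()/MatAssembly calls are the actual
PETSc API):

  #include <petscmat.h>

  #define NDOF 24  /* assumed: 8-node brick elements with 3 DOF per node */

  extern void getElementDofs(PetscInt e, PetscInt idx[]);         /* hypothetical */
  extern void computeElementMatrix(PetscInt e, PetscScalar Ke[]); /* hypothetical */

  /* Each rank computes a contiguous block of element matrices and calls
     MatSetValues() with global indices, regardless of which rank owns
     those rows. */
  PetscErrorCode AssembleBlock(Mat A, PetscInt nElem, PetscMPIInt rank, PetscMPIInt size)
  {
    PetscErrorCode ierr;
    PetscInt       e, eStart = rank*nElem/size, eEnd = (rank+1)*nElem/size;
    PetscInt       idx[NDOF];       /* global DOF indices of element e */
    PetscScalar    Ke[NDOF*NDOF];   /* dense element stiffness matrix  */

    PetscFunctionBegin;
    for (e = eStart; e < eEnd; e++) {
      getElementDofs(e, idx);
      computeElementMatrix(e, Ke);
      /* entries in rows owned by other ranks are stashed here and only
         sent inside MatAssemblyBegin/End */
      ierr = MatSetValues(A, NDOF, idx, NDOF, idx, Ke, ADD_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }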
This is what I did, according to my understanding of finite elements and
what I have seen elsewhere.
Do you have some nice libraries or packages that can be easily used in a
scientific computing environment?
Thank you very much!
Fangbo Wang
On Sun, Feb 26, 2017 at 11:15 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> > On Feb 26, 2017, at 10:04 PM, Fangbo Wang <fangbowa at buffalo.edu> wrote:
> >
> > My problem is a solid mechanics problem using the finite element method to
> discretize the model (a 30m x 30m x 30m soil domain with a building structure
> on top).
> >
> > I am not manually deciding which MPI process computes which matrix
> entries, because I know PETSc can communicate between these
> processes automatically.
> > I am just asking each MPI process to generate a certain number of matrix
> entries, regardless of which process will finally store them.
>
> The standard way to handle this for finite elements is to partition the
> elements among the processes and then partition the nodes (rows of the
> degrees of freedom) subservient to the partitioning of the elements.
> Otherwise most of the matrix (or vector) entries must be communicated and
> this is not scalable.
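>
>    A minimal sketch of that idea, assuming a contiguous row partition and
> using MatGetOwnershipRange(); the "owner of the element's first DOF" rule
> and the helper calls are simplifications (real codes partition the mesh
> with a tool such as ParMETIS):
>
>   PetscInt rStart, rEnd, e;
>   ierr = MatGetOwnershipRange(A, &rStart, &rEnd);CHKERRQ(ierr);
>   for (e = 0; e < nElem; e++) {
>     getElementDofs(e, idx);                          /* placeholder */
>     if (idx[0] < rStart || idx[0] >= rEnd) continue; /* not my element */
>     computeElementMatrix(e, Ke);                     /* placeholder */
>     /* these rows are (mostly) locally owned, so little data is stashed */
>     ierr = MatSetValues(A, NDOF, idx, NDOF, idx, Ke, ADD_VALUES);CHKERRQ(ierr);
>   }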
>
> So how are you partitioning the elements (for matrix stiffness
> computations) and the nodes between processes?
> >
> > Actually, I constructed another matrix of the same size but generated
> far fewer entries, and the code worked. However, it gets stuck when I
> generate more matrix entries.
> >
> > thank you very much! Any suggestion is highly appreciated.
> >
> > BTW, what is the meaning of "[4] MatCheckCompressedRow(): Found the
> ratio (num_zerorows 0)/(num_localrows 96812) < 0.6. Do not use
> CompressedRow routines."? I know the compressed row format is commonly
> used for sparse matrices, so why aren't the compressed row routines used here?
>
> This is not important.
>
> >
> >
> > Thanks,
> >
> >
> > Fangbo Wang
> >
> >
> >
> > On Sun, Feb 26, 2017 at 10:42 PM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
> >
> > How are you generating the matrix entries in parallel? In general you
> can generate any matrix entries on any MPI process and they will be
> transferred automatically to the MPI process that owns them.
> BUT if a huge number of matrix entries are computed on one
> process and need to be communicated to another process, this may cause
> gridlock with MPI. Based on the huge size of the messages from process 12, it
> looks like this is what is happening in your code.
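> >
> >    If many entries must be computed away from their owning process
> > anyway, one standard remedy is to flush the stash periodically with
> > MAT_FLUSH_ASSEMBLY, so that no single message grows to hundreds of
> > megabytes. A minimal sketch (the batch size of 1000 is arbitrary; the
> > flush is collective, so every rank must call it the same number of times):
> >
> >   for (e = eStart; e < eEnd; e++) {
> >     /* ... compute idx and Ke as before ... */
> >     ierr = MatSetValues(A, NDOF, idx, NDOF, idx, Ke, ADD_VALUES);CHKERRQ(ierr);
> >     if ((e - eStart) % 1000 == 999) {
> >       ierr = MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
> >       ierr = MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
> >     }
> >   }
> >   ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
> >   ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);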
> >
> > Ideally most matrix entries are generated on the process where they are
> stored, and hence this gridlock does not happen.
> >
> > What type of discretization are you using? Finite differences, finite
> element, finite volume, spectral, something else? How are you deciding
> which MPI process should compute which matrix entries? Once we understand
> this we may be able to suggest a better way to compute the entries.
> >
> > Barry
> >
> > Under normal circumstances 1.3 million unknowns is not a large
> parallel matrix; there may be special features of your matrix that are
> making this difficult.
> >
> >
> >
> > > On Feb 26, 2017, at 9:30 PM, Fangbo Wang <fangbowa at buffalo.edu> wrote:
> > >
> > > Hi,
> > >
> > > I construct a big matrix which is 1.3 million by 1.3 million, using
> > approximately 100 GB of memory. I have a computer with 500 GB of memory.
> > >
> > > I run the PETSc program and it gets stuck when finally assembling the
> matrix. The program is using only around 200 GB of memory; however, it
> just hangs there. Here is the output when it gets stuck:
> > > .
> > > .
> > > previous outputs not shown here
> > > .
> > > [12] MatStashScatterBegin_Ref(): No of messages: 15
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 0: size: 271636416 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 1: size: 328581552 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 2: size: 163649328 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 3: size: 95512224 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 4: size: 317711616 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 5: size: 170971776 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 6: size: 254000064 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 7: size: 163146720 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 8: size: 345150048 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 9: size: 163411584 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 10: size: 428874816 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 11: size: 739711296 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 13: size: 435247344 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 14: size: 435136752 bytes
> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 15: size: 346167552 bytes
> > > [14] MatAssemblyBegin_MPIAIJ(): Stash has 263158893 entries, uses 14
> mallocs.
> > > [8] MatAssemblyBegin_MPIAIJ(): Stash has 286768572 entries, uses 14
> mallocs.
> > > [12] MatAssemblyBegin_MPIAIJ(): Stash has 291181818 entries, uses 14
> mallocs.
> > > [13] MatStashScatterBegin_Ref(): No of messages: 15
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 0: size: 271636416 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 1: size: 271636416 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 2: size: 220594464 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 3: size: 51041952 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 4: size: 276201408 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 5: size: 256952256 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 6: size: 198489024 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 7: size: 218657760 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 8: size: 219686880 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 9: size: 288874752 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 10: size: 428874816 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 11: size: 172579968 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 12: size: 639835680 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 14: size: 270060144 bytes
> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 15: size: 511244160 bytes
> > > [13] MatAssemblyBegin_MPIAIJ(): Stash has 268522881 entries, uses 14
> mallocs.
> > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 96812 X 96812; storage
> space: 89786788 unneeded,7025212 used
> > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
> > > [5] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 96812) < 0.6. Do not use CompressedRow routines.
> > > [5] MatSeqAIJCheckInode(): Found 32271 nodes of 96812. Limit used: 5.
> Using Inode routines
> > > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 96812 X 96812; storage
> space: 89841924 unneeded,6970076 used
> > > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
> > > [4] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 96812) < 0.6. Do not use CompressedRow routines.
> > > [4] MatSeqAIJCheckInode(): Found 32272 nodes of 96812. Limit used: 5.
> Using Inode routines
> > >
> > > stuck here!!!!
> > >
> > >
> > > Any one have ideas on this? Thank you very much!
> > >
> > >
> > >
> > > Fangbo Wang
> > >
> > >
> > >
> > > --
> > > Fangbo Wang, PhD student
> > > Stochastic Geomechanics Research Group
> > > Department of Civil, Structural and Environmental Engineering
> > > University at Buffalo
> > > Email: fangbowa at buffalo.edu
> >
> >
> >
> >
> > --
> > Fangbo Wang, PhD student
> > Stochastic Geomechanics Research Group
> > Department of Civil, Structural and Environmental Engineering
> > University at Buffalo
> > Email: fangbowa at buffalo.edu
>
>
--
Fangbo Wang, PhD student
Stochastic Geomechanics Research Group
Department of Civil, Structural and Environmental Engineering
University at Buffalo
Email: fangbowa at buffalo.edu