Slow assembly

Barry Smith bsmith at mcs.anl.gov
Mon Oct 29 06:56:42 CDT 2007


  You are assigning some matrix values on a process that is different
from the one where they will "live". In general you want to generate 90%+
of the matrix elements on the process where they live; otherwise the MatAssembly
will be slow.

   Barry


On Mon, 29 Oct 2007, John R. Wicks wrote:

> Although I have the malloc problem fixed, I'm wondering about the number of
> messages during assembly:
> [0] MatStashScatterBegin_Private(): No of messages: 0 
> [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [1] MatStashScatterBegin_Private(): No of messages: 1 
> [1] MatStashScatterBegin_Private(): Mesg_to: 0: size: 272 
> [1] MatAssemblyBegin_MPIAIJ(): Stash has 33 entries, uses 0 mallocs.
> [3] MatStashScatterBegin_Private(): No of messages: 3 
> [3] MatStashScatterBegin_Private(): Mesg_to: 0: size: 360 
> [3] MatStashScatterBegin_Private(): Mesg_to: 1: size: 328 
> [3] MatStashScatterBegin_Private(): Mesg_to: 2: size: 360 
> [3] MatAssemblyBegin_MPIAIJ(): Stash has 128 entries, uses 0 mallocs.
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 11; storage space: 0
> unneeded,11 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [6] MatStashScatterBegin_Private(): No of messages: 6 
> [6] MatStashScatterBegin_Private(): Mesg_to: 0: size: 184 
> [6] MatStashScatterBegin_Private(): Mesg_to: 1: size: 168 
> [6] MatStashScatterBegin_Private(): Mesg_to: 2: size: 184 
> [6] MatStashScatterBegin_Private(): Mesg_to: 3: size: 184 
> [6] MatStashScatterBegin_Private(): Mesg_to: 4: size: 152 
> [6] MatStashScatterBegin_Private(): Mesg_to: 5: size: 168 
> [6] MatAssemblyBegin_MPIAIJ(): Stash has 124 entries, uses 0 mallocs.
> [2] MatStashScatterBegin_Private(): No of messages: 2 
> [2] MatStashScatterBegin_Private(): Mesg_to: 0: size: 96 
> [2] MatStashScatterBegin_Private(): Mesg_to: 1: size: 88 
> [2] MatAssemblyBegin_MPIAIJ(): Stash has 21 entries, uses 0 mallocs.
> [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 11; storage space: 0
> unneeded,11 used
> [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [5] MatStashScatterBegin_Private(): No of messages: 5 
> [5] MatStashScatterBegin_Private(): Mesg_to: 0: size: 360 
> [5] MatStashScatterBegin_Private(): Mesg_to: 1: size: 328 
> [5] MatStashScatterBegin_Private(): Mesg_to: 2: size: 360 
> [5] MatStashScatterBegin_Private(): Mesg_to: 3: size: 360 
> [5] MatStashScatterBegin_Private(): Mesg_to: 4: size: 296 
> [5] MatAssemblyBegin_MPIAIJ(): Stash has 208 entries, uses 0 mallocs.
> [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0
> unneeded,10 used
> [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [6] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0
> unneeded,10 used
> [6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0
> unneeded,10 used
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [4] MatStashScatterBegin_Private(): No of messages: 4 
> [4] MatStashScatterBegin_Private(): Mesg_to: 0: size: 184 
> [4] MatStashScatterBegin_Private(): Mesg_to: 1: size: 168 
> [4] MatStashScatterBegin_Private(): Mesg_to: 2: size: 184 
> [4] MatStashScatterBegin_Private(): Mesg_to: 3: size: 184 
> [4] MatAssemblyBegin_MPIAIJ(): Stash has 86 entries, uses 0 mallocs.
> [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 11; storage space: 0
> unneeded,11 used
> [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 9 X 9; storage space: 0 unneeded,9
> used
> [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [7] MatStashScatterBegin_Private(): No of messages: 7 
> [7] MatStashScatterBegin_Private(): Mesg_to: 0: size: 184 
> [7] MatStashScatterBegin_Private(): Mesg_to: 1: size: 168 
> [7] MatStashScatterBegin_Private(): Mesg_to: 2: size: 184 
> [7] MatStashScatterBegin_Private(): Mesg_to: 3: size: 184 
> [7] MatStashScatterBegin_Private(): Mesg_to: 4: size: 152 
> [7] MatStashScatterBegin_Private(): Mesg_to: 5: size: 168 
> [7] MatStashScatterBegin_Private(): Mesg_to: 6: size: 168 
> [6] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
> [6] PetscCommDuplicate():   returning tag 2147483643
> [5] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
> [5] PetscCommDuplicate():   returning tag 2147483643
> [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
> [2] PetscCommDuplicate():   returning tag 2147483643
> [6] PetscCommDuplicate():   returning tag 2147483643
> [5] PetscCommDuplicate():   returning tag 2147483643
> [2] PetscCommDuplicate():   returning tag 2147483643
> [7] MatAssemblyBegin_MPIAIJ(): Stash has 144 entries, uses 0 mallocs.
> [7] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 8; storage space: 4 unneeded,4
> used
> [7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> -2080374784
> [1] PetscCommDuplicate():   returning tag 2147483643
> [7] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> -2080374784
> [7] PetscCommDuplicate():   returning tag 2147483643
> [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> -2080374784
> [3] PetscCommDuplicate():   returning tag 2147483643
> [7] PetscCommDuplicate():   returning tag 2147483643
> [1] PetscCommDuplicate():   returning tag 2147483643
> [3] PetscCommDuplicate():   returning tag 2147483643
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> -2080374784
> [0] PetscCommDuplicate():   returning tag 2147483643
> [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
> [0] PetscCommDuplicate():   returning tag 2147483643
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
> -2080374784
> [0] PetscCommDuplicate():   returning tag 2147483642
> [0] PetscCommDuplicate():   returning tag 2147483642
> [0] PetscCommDuplicate():   returning tag 2147483637
> [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
> [0] VecScatterCreate(): General case: MPI to Seq
> [0] MatSetOption_Inode(): Not using Inode routines due to
> MatSetOption(MAT_DO_NOT_USE_INODES
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 8; storage space: 0
> unneeded,14 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 2
> 
> What is this "stash"?  Does this mean that it is sending matrix entries
> between processes (because I should only be setting entries local to each
> process), or some other kind of meta-information?
> 
> > -----Original Message-----
> > From: owner-petsc-users at mcs.anl.gov 
> > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
> > Sent: Friday, October 26, 2007 7:54 PM
> > To: petsc-users at mcs.anl.gov
> > Subject: RE: Slow assembly
> > 
> > 
> > 
> >   The sorting does not matter.
> > 
> >   Under normal conditions the MatAssembly should take a 
> > fraction of a second. The only cause that we know that slows 
> > it down to the extreme you have is that it is sending a huge 
> > amount of data across processes (the -info option Satish 
> > suggested will tell us if that is true). 
> > 
> >   Are you only calling MatAssemblyBegin/End() once? You should; don't
> > call it multiple times.
> > 
> >   The sorting is not important (in fact PETSc takes advantage of it
> > automatically, so the option does not need to be set).
> > 
> >    Barry
> > 
> > 
> > On Fri, 26 Oct 2007, John R. Wicks wrote:
> > 
> > > I have confirmed that I am calling MatSetValues() for local rows only
> > > and am only setting each value exactly once.
> > > 
> > > Because of how the matrix was partitioned for another non-PETSc
> > > program, each partition is partitioned (by columns) into 32 blocks
> > > (corresponding to the row partitions).  I enter the data for each
> > > block one row at a time, i.e., for any one MatSetValues call, the entries
> > > are sorted by increasing column index.  Does that mean I can use
> > > MatSetOption(A,MAT_COLUMNS_SORTED)?  Should that help?
> > > 
> > > P.S.: I tried it, and it still seems to be taking quite a long time.
> > > 
> > > > -----Original Message-----
> > > > From: owner-petsc-users at mcs.anl.gov
> > > > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Satish Balay
> > > > Sent: Friday, October 26, 2007 3:04 PM
> > > > To: petsc-users at mcs.anl.gov
> > > > Subject: Re: Slow assembly
> > > > 
> > > > 
> > > > On Fri, 26 Oct 2007, John R. Wicks wrote:
> > > > 
> > > > > I am working on computing PageRank for a web scale graph which uses
> > > > > a square matrix which is 1.2x10^8 dimensional with about 10^9 entries.
> > > > > I have partitioned the matrix for 32 processors myself into my own
> > > > > ascii format, and I know the memory allocation, so I:
> > > > > 
> > > > > 1) create the matrix with "A = MatCreateMPIAIJ(*n, *n, *N, *N, 0, nnz,
> > > > > 0, onnz)",
> > > > > 2) load the entries by repeatedly calling
> > > > > "MatSetValues(A,1,&row,links,cols,vals,INSERT_VALUES)", and
> > > > > 
> > > > > 3) call MatAssemblyBegin/End.
> > > > > 
> > > > > Steps 1 and 2 complete in a couple minutes, but step 3 is taking
> > > > > several hours.  What is going on?  Is there a way to speed up matrix
> > > > > assembly?
> > > > 
> > > > Are you making sure that you call MatGetOwnershipRange() -
> > > > and calling MatSetValues() for mostly local rows only?
> > > > 
> > > > Also can you confirm that multiple processes [e.g. proc-0
> > > > and proc-1, etc.] are not setting the same value [i.e. both
> > > > of them calling MatSetValues(row=0,col=0)]?
> > > > 
> > > > Satish
> > > > 
> > > 
> > > 
> > 
> 
> 



