Slow assembly

John R. Wicks jwicks at cs.brown.edu
Mon Oct 29 05:22:52 CDT 2007


Although I have the malloc problem fixed, I'm wondering about the number of
messages during assembly:
[0] MatStashScatterBegin_Private(): No of messages: 0 
[0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[1] MatStashScatterBegin_Private(): No of messages: 1 
[1] MatStashScatterBegin_Private(): Mesg_to: 0: size: 272 
[1] MatAssemblyBegin_MPIAIJ(): Stash has 33 entries, uses 0 mallocs.
[3] MatStashScatterBegin_Private(): No of messages: 3 
[3] MatStashScatterBegin_Private(): Mesg_to: 0: size: 360 
[3] MatStashScatterBegin_Private(): Mesg_to: 1: size: 328 
[3] MatStashScatterBegin_Private(): Mesg_to: 2: size: 360 
[3] MatAssemblyBegin_MPIAIJ(): Stash has 128 entries, uses 0 mallocs.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 11; storage space: 0 unneeded, 11 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[6] MatStashScatterBegin_Private(): No of messages: 6 
[6] MatStashScatterBegin_Private(): Mesg_to: 0: size: 184 
[6] MatStashScatterBegin_Private(): Mesg_to: 1: size: 168 
[6] MatStashScatterBegin_Private(): Mesg_to: 2: size: 184 
[6] MatStashScatterBegin_Private(): Mesg_to: 3: size: 184 
[6] MatStashScatterBegin_Private(): Mesg_to: 4: size: 152 
[6] MatStashScatterBegin_Private(): Mesg_to: 5: size: 168 
[6] MatAssemblyBegin_MPIAIJ(): Stash has 124 entries, uses 0 mallocs.
[2] MatStashScatterBegin_Private(): No of messages: 2 
[2] MatStashScatterBegin_Private(): Mesg_to: 0: size: 96 
[2] MatStashScatterBegin_Private(): Mesg_to: 1: size: 88 
[2] MatAssemblyBegin_MPIAIJ(): Stash has 21 entries, uses 0 mallocs.
[2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 11; storage space: 0 unneeded, 11 used
[2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[5] MatStashScatterBegin_Private(): No of messages: 5 
[5] MatStashScatterBegin_Private(): Mesg_to: 0: size: 360 
[5] MatStashScatterBegin_Private(): Mesg_to: 1: size: 328 
[5] MatStashScatterBegin_Private(): Mesg_to: 2: size: 360 
[5] MatStashScatterBegin_Private(): Mesg_to: 3: size: 360 
[5] MatStashScatterBegin_Private(): Mesg_to: 4: size: 296 
[5] MatAssemblyBegin_MPIAIJ(): Stash has 208 entries, uses 0 mallocs.
[5] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded, 10 used
[5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[6] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded, 10 used
[6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded, 10 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[4] MatStashScatterBegin_Private(): No of messages: 4 
[4] MatStashScatterBegin_Private(): Mesg_to: 0: size: 184 
[4] MatStashScatterBegin_Private(): Mesg_to: 1: size: 168 
[4] MatStashScatterBegin_Private(): Mesg_to: 2: size: 184 
[4] MatStashScatterBegin_Private(): Mesg_to: 3: size: 184 
[4] MatAssemblyBegin_MPIAIJ(): Stash has 86 entries, uses 0 mallocs.
[3] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 11; storage space: 0 unneeded, 11 used
[3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[4] MatAssemblyEnd_SeqAIJ(): Matrix size: 9 X 9; storage space: 0 unneeded, 9 used
[4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[7] MatStashScatterBegin_Private(): No of messages: 7 
[7] MatStashScatterBegin_Private(): Mesg_to: 0: size: 184 
[7] MatStashScatterBegin_Private(): Mesg_to: 1: size: 168 
[7] MatStashScatterBegin_Private(): Mesg_to: 2: size: 184 
[7] MatStashScatterBegin_Private(): Mesg_to: 3: size: 184 
[7] MatStashScatterBegin_Private(): Mesg_to: 4: size: 152 
[7] MatStashScatterBegin_Private(): Mesg_to: 5: size: 168 
[7] MatStashScatterBegin_Private(): Mesg_to: 6: size: 168 
[6] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[6] PetscCommDuplicate():   returning tag 2147483643
[5] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[5] PetscCommDuplicate():   returning tag 2147483643
[2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[2] PetscCommDuplicate():   returning tag 2147483643
[6] PetscCommDuplicate():   returning tag 2147483643
[5] PetscCommDuplicate():   returning tag 2147483643
[2] PetscCommDuplicate():   returning tag 2147483643
[7] MatAssemblyBegin_MPIAIJ(): Stash has 144 entries, uses 0 mallocs.
[7] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 8; storage space: 4 unneeded, 4 used
[7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[1] PetscCommDuplicate():   returning tag 2147483643
[7] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[7] PetscCommDuplicate():   returning tag 2147483643
[3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[3] PetscCommDuplicate():   returning tag 2147483643
[7] PetscCommDuplicate():   returning tag 2147483643
[1] PetscCommDuplicate():   returning tag 2147483643
[3] PetscCommDuplicate():   returning tag 2147483643
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[0] PetscCommDuplicate():   returning tag 2147483643
[0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
[0] PetscCommDuplicate():   returning tag 2147483643
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
[0] PetscCommDuplicate():   returning tag 2147483642
[0] PetscCommDuplicate():   returning tag 2147483642
[0] PetscCommDuplicate():   returning tag 2147483637
[0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
[0] VecScatterCreate(): General case: MPI to Seq
[0] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11 X 8; storage space: 0 unneeded, 14 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 2

What is this "stash"?  Does this mean that it is sending matrix entries
between processes (because I should only be setting entries local to each
process), or some other kind of meta-information?

> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov 
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
> Sent: Friday, October 26, 2007 7:54 PM
> To: petsc-users at mcs.anl.gov
> Subject: RE: Slow assembly
> 
> 
> 
>   The sorting does not matter.
> 
>   Under normal conditions the MatAssembly should take a 
> fraction of a second. The only cause we know of that slows 
> it down to the extreme you are seeing is that it is sending a 
> huge amount of data across processes (the -info option Satish 
> suggested will tell us if that is true). 
> 
>   Are you calling MatAssemblyBegin/End() only once? You should; 
> don't call it multiple times. 
> 
>   The sorting is not important (in fact PETSc takes advantage of 
> it automatically, and it does not need to be set).
> 
>    Barry
> 
> 
> On Fri, 26 Oct 2007, John R. Wicks wrote:
> 
> > I have confirmed that I am calling MatSetValues() for local 
> rows only 
> > and am only setting each value exactly once.
> > 
> > Because of how the matrix was partitioned for another non-Petsc 
> > program, each partition is partitioned (by columns) into 32 blocks 
> > (corresponding to the row partitions).  I enter the data for each 
> > block one row at a time, i.e., for any one SetValues call, 
> the entries 
> > are sorted by increasing column index.  Does that mean I can use 
> > MatSetOption(A,MAT_COLUMNS_SORTED)?  Should that help?
> > 
> > P.S.: I tried it, and it still seems to be taking quite a long time.
> > 
> > > -----Original Message-----
> > > From: owner-petsc-users at mcs.anl.gov
> > > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Satish Balay
> > > Sent: Friday, October 26, 2007 3:04 PM
> > > To: petsc-users at mcs.anl.gov
> > > Subject: Re: Slow assembly
> > > 
> > > 
> > > On Fri, 26 Oct 2007, John R. Wicks wrote:
> > > 
> > > > I am working on computing PageRank for a web scale graph
> > > which uses a
> > > > square matrix which is 1.2x10^8 dimensional with about 10^9
> > > entries.
> > > > I have partitioned the matrix for 32 processors myself 
> into my own
> > > > ascii format, and I know the memory allocation, so I:
> > > > 
> > > > 1) create the matrix with "A = MatCreateMPIAIJ(*n, *n, *N,
> > > *N, 0, nnz,
> > > > 0, onnz)",
> > > > 2) load the entries by repeatedly calling
> > > > "MatSetValues(A,1,&row,links,cols,vals,INSERT_VALUES)", and
> > > > 
> > > > 3) call MatAssemblyBegin/End.
> > > > 
> > > > Steps 1 and 2 complete in a couple minutes, but step 3 is taking
> > > > several hours.  What is going on?  Is there a way to speed 
> > > up matrix
> > > > assembly?
> > > 
> > > Are you making sure that you call MatGetOwnershipRange() - 
> > > and calling MatSetValues() for mostly local rows only?
> > > 
> > > Also, can you confirm that multiple processes [e.g., proc-0 
> > > and proc-1]  are not setting the same value [i.e., both 
> > > of them calling MatSetValues(row=0,col=0)]?
> > > 
> > > Satish
> > > 
> > 
> > 
> 



