[petsc-users] Question on writing a large matrix

Barry Smith bsmith at mcs.anl.gov
Thu Apr 21 12:25:33 CDT 2011


On Apr 21, 2011, at 11:59 AM, S V N Vishwanathan wrote:

> 
>> What is 'painfully slow'?  Do you have a profile or an estimate in
>> terms of GB/s?  Have you taken a look at your process's memory
>> allocation and checked to see if it is swapping?  My first guess would
>> be that you are exceeding RAM and your program is thrashing as parts
>> of the page table get swapped to and from disk mid-run.
> 
> A single machine does not have enough memory to hold the entire
> matrix. That is why I have to assemble it in parallel. When distributed
> across 8 machines, the assembly seemed to finish in under an hour.

   It has not assembled the matrix in an hour; it has been working all night to assemble it. The problem is that you are not preallocating the nonzeros per row with MatMPIAIJSetPreallocation(). When the preallocation is correct, the -info output will always report 0 for the number of mallocs during MatSetValues(). The actual writing of the parallel matrix to the binary file should take at most a few minutes.
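
   A minimal sketch of the preallocation, assuming you count (or bound) the nonzeros per row in a first pass over the input file; nlocal, M, N, d_nnz[] and o_nnz[] are placeholders for those counts, and error checking is omitted:

      Mat A;
      MatCreate(PETSC_COMM_WORLD,&A);
      MatSetSizes(A,nlocal,PETSC_DECIDE,M,N);   /* nlocal owned rows, M x N global */
      MatSetType(A,MATMPIAIJ);
      /* d_nnz[i]/o_nnz[i]: nonzeros of local row i that fall in the diagonal
         and off-diagonal blocks, counted in the first pass over the file */
      MatMPIAIJSetPreallocation(A,0,d_nnz,0,o_nnz);
      /* ... MatSetValues() loop over the locally owned rows ... */
      MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);

   With the counts in place, the -info output should report 0 mallocs during MatSetValues().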

   Barry


> However,
> my program had been trying to write the matrix to a file since last night
> and eventually crashed. The log just indicated
> 
> [1]PETSC ERROR: Caught signal number 1 Hang up: Some other process (or the batch system) has told this process to end
> 
> Most likely because it tried to allocate a large chunk of memory and
> failed. 
> 
> I investigated using a smaller matrix and ran the code with the -info
> flag (see below). What worries me are these lines:
> 
> Writing data in binary format to adult9.train.x 
> ....  >>>> I call MatView in my code here 
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281
> 
> Is MatView reconstructing the matrix at the root node? In that case the
> program will definitely fail due to lack of memory. 
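
   For reference, the parallel binary write itself is just a viewer plus MatView -- a minimal sketch, assuming the assembled matrix is called A, with error checking omitted:

      PetscViewer viewer;
      PetscViewerBinaryOpen(PETSC_COMM_WORLD,"adult9.train.x",FILE_MODE_WRITE,&viewer);
      MatView(A,viewer);
      PetscViewerDestroy(&viewer);   /* some older releases take the viewer, not its address */
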
> 
> Please let me know if you need any other information or if I can run
> any other tests to help investigate.
> 
> vishy
> 
> 
> 
> 
> mpiexec -n 2 ./libsvm-to-binary -in ../LibSVM/biclass/adult9/adult9.train.txt -data adult9.train.x -labels adult9.train.y -info 
> 
> [0] PetscInitialize(): PETSc successfully started: number of processors = 2
> [1] PetscInitialize(): PETSc successfully started: number of processors = 2
> [1] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu
> [0] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu
> No libsvm test file specified!
> 
> Reading libsvm train file at ../LibSVM/biclass/adult9/adult9.train.txt
> [0] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt
> [1] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt
> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374780 max tags = 2147483647
> [0] PetscCommDuplicate():   returning tag 2147483647
> [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374782 max tags = 2147483647
> [1] PetscCommDuplicate():   returning tag 2147483647
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780
> [0] PetscCommDuplicate():   returning tag 2147483642
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782
> [1] PetscCommDuplicate():   returning tag 2147483642
> [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 2147483647
> [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374780 max tags = 2147483647
> [1] PetscCommDuplicate():   returning tag 2147483647
> [0] PetscCommDuplicate():   returning tag 2147483647
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777
> [0] PetscCommDuplicate():   returning tag 2147483646
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
> [1] PetscCommDuplicate():   returning tag 2147483646
> [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs.
> [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs.
> [0] MatStashScatterBegin_Private(): No of messages: 0 
> [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 124; storage space: 225806 unneeded,0 used
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0
> [1] Mat_CheckInode(): Found 3257 nodes of 16281. Limit used: 5. Using Inode routines
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 124; storage space: 0 unneeded,225786 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14
> [0] Mat_CheckInode(): Found 16280 nodes out of 16280 rows. Not using Inode routines
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777
> [0] PetscCommDuplicate():   returning tag 2147483645
> [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
> [0] PetscCommDuplicate():   returning tag 2147483638
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
> [1] PetscCommDuplicate():   returning tag 2147483645
> [1] PetscCommDuplicate():   returning tag 2147483638
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780
> [1] PetscCommDuplicate():   returning tag 2147483644
> [1] PetscCommDuplicate():   returning tag 2147483637
> [0] PetscCommDuplicate():   returning tag 2147483644
> [0] PetscCommDuplicate():   returning tag 2147483637
> [1] PetscCommDuplicate():   returning tag 2147483632
> [0] PetscCommDuplicate():   returning tag 2147483632
> [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
> [0] VecScatterCreate(): General case: MPI to Seq
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 0; storage space: 0 unneeded,0 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0
> 
> Writing data in binary format to adult9.train.x 
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780
> [0] PetscCommDuplicate():   returning tag 2147483628
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 123; storage space: 18409 unneeded,225806 used
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281
> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782
> [1] PetscCommDuplicate():   returning tag 2147483628
> [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689
> [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780
> [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780
> [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780
> [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782
> [1] PetscCommDuplicate():   returning tag 2147483627
> [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689
> [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374777
> [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374777
> [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374777
> 
> Writing labels in binary format to adult9.train.y 
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780
> [0] PetscCommDuplicate():   returning tag 2147483627
> [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688
> [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374782
> [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374782
> [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374782
> [1] PetscFinalize(): PetscFinalize() called
> [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688
> [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780
> [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780
> [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780
> [0] PetscFinalize(): PetscFinalize() called
> 
> 
>> 
>> On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan <vishy at stat.purdue.edu> wrote:
>> 
>>    Hi
>> 
>>    I am using the attached code to convert a matrix from a rather
>>    inefficient ascii format (each line is a row and contains a series of
>>    idx:val pairs) to the PETSc binary format. Some of the matrices that I
>>    am working with are rather huge (50GB ascii file) and cannot be
>>    assembled on a single processor. When I use the attached code the matrix
>>    assembly across machines seems to be fairly fast. However, dumping the
>>    assembled matrix out to disk seems to be painfully slow. Any suggestions
>>    on how to speed things up will be deeply appreciated.
>> 
>>    vishy
>> 
>> 
> 


