[petsc-users] How to assemble a sparse SPD matrix in parallel

Barry Smith bsmith at mcs.anl.gov
Tue May 7 13:04:43 CDT 2013


On May 7, 2013, at 7:59 AM, Thomas Ponweiser <Thomas.Ponweiser at risc-software.at> wrote:

> Dear PETSc community!
>  
> I would like to read in a (large) sparse SPD matrix from a file in parallel. More precisely, my plan was to do the following:
>  
> 1) Read matrix size N from file.
> 2) Create PETSc matrix.
> 3) Set option MAT_SPD=PETSC_TRUE.
> 4) Set global size N x N, local sizes PETSC_DECIDE.
> 5) Read in only those rows from file, which are owned by the local process.

    Having many processes independently read from different parts of the file will not be efficient with normal UNIX I/O; it will make the loading terribly slow.

    I recommend writing a sequential program that reads in the matrix and saves it with MatView() to a binary viewer, and then using MatLoad() to read it back in in parallel. That way you do not have to do any complicated coding and will get efficient loading.

    Barry
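
    For illustration, that suggested workflow might look roughly like the sketch below (assuming the original ASCII file has already been parsed into a sequential Mat A; "matrix.dat" is a placeholder file name):

        #include <petscmat.h>

        /* One-time sequential conversion: write A in PETSc binary format. */
        PetscViewer    viewer;
        PetscErrorCode ierr;

        ierr = PetscViewerBinaryOpen(PETSC_COMM_SELF, "matrix.dat", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
        ierr = MatView(A, viewer);CHKERRQ(ierr);
        ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

        /* In the parallel program: MatLoad() reads the file and distributes the rows. */
        Mat B;
        ierr = MatCreate(PETSC_COMM_WORLD, &B);CHKERRQ(ierr);
        ierr = MatSetType(B, MATAIJ);CHKERRQ(ierr);
        ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.dat", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
        ierr = MatLoad(B, viewer);CHKERRQ(ierr);
        ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
        ierr = MatSetOption(B, MAT_SPD, PETSC_TRUE);CHKERRQ(ierr);   /* mark the matrix as SPD */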

> 6) Preallocate the matrix using statistics collected in the previous step.
> 7) Insert the values read into the matrix row-by-row.
> 8) Begin and finish matrix assembly.
>  
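    For reference, steps 6) through 8) of this plan would map onto PETSc calls roughly as in the following sketch (d_nnz/o_nnz are the per-row counts of nonzeros inside/outside the diagonal block collected in step 5; the routine that reads a row's column indices and values from the file is assumed):

        PetscInt i;   /* rstart, rend, d_nnz, o_nnz come from the earlier steps */

        /* 6) Preallocate using the counts gathered while scanning the file. */
        ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);

        /* 7) Insert the locally owned rows one at a time. */
        for (i = rstart; i < rend; i++) {
          /* ncols, cols[] and vals[] for row i come from the file reader (not shown) */
          ierr = MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
        }

        /* 8) Assemble the matrix. */
        ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
        ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
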
> My problem is in step 5, leading to 3 questions:
>  
> QUESTION 1: How can I let PETSc decide which rows of the global matrix will be local to the process BEFORE preallocation?
>  
> In the manual pages I have found so far:
> A) MatGetOwnershipRange():
> “requires that the matrix be preallocated”.
> B) MatGetOwnershipRanges():
> “Not collective, unless matrix has not been allocated, then collective on Mat”
> However, when running the program, I get the error message: “Must call MatXXXSetPreallocation() or MatSetUp() … before MatGetOwnershipRanges()!”
>  
> QUESTION 2: Is the documentation of MatGetOwnershipRanges() incorrect or am I misinterpreting it?
> -> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetOwnershipRanges.html
>  
> I finally got the program running by calling MatSetUp() before MatGetOwnershipRange(). Still, I cannot shake the feeling that I am not doing things the way the developers intended, since preallocation is now effectively done twice.
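
    For context, the call sequence described here would look roughly like this sketch (assuming an MPIAIJ matrix; MatSetUp() performs a default preallocation, which the explicit preallocation call afterwards replaces):

        ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
        ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
        ierr = MatSetType(A, MATMPIAIJ);CHKERRQ(ierr);
        ierr = MatSetOption(A, MAT_SPD, PETSC_TRUE);CHKERRQ(ierr);
        ierr = MatSetUp(A);CHKERRQ(ierr);                              /* default preallocation */
        ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);  /* now legal to call */
        /* ... read rows rstart..rend-1 from the file and count their nonzeros ... */
        ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);  /* preallocation, again */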
>  
> The alternative seems to be to use PetscSplitOwnership() and MPI_Scan() to calculate the row ranges for each process before creating the matrix with MatCreate(). But this always leads to a very even distribution of row counts among the processes. Assuming that only the upper triangular part of the symmetric matrix needs to be stored (IS THIS CORRECT?), I would guess that this leads to an imbalance in the number of nonzero matrix entries owned by each process (processes with higher rank will own fewer nonzeros).
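
    That alternative could be sketched roughly as follows (my assumption of how the two calls would be combined; N is the global size read in step 1):

        PetscInt n = PETSC_DECIDE, rstart, rend;

        /* Split N rows as evenly as possible; n receives this process's row count. */
        ierr = PetscSplitOwnership(PETSC_COMM_WORLD, &n, &N);CHKERRQ(ierr);

        /* Prefix sum of the local counts gives the end of each process's row range. */
        ierr = MPI_Scan(&n, &rend, 1, MPIU_INT, MPI_SUM, PETSC_COMM_WORLD);CHKERRQ(ierr);
        rstart = rend - n;   /* this process owns global rows rstart .. rend-1 */

        /* n can then be passed as the local size to MatSetSizes() before preallocating. */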
>  
> QUESTION 3: For SPD matrices, is it in general a good strategy to have every process own approximately the same number of rows? (In this case, I can of course forget about PetscSplitOwnership() and MPI_Scan() and do the distribution myself.)
>  
> Thank you and kind regards,
> Thomas Ponweiser


