parallel solvers and matrix structure

Khan, Irfan irfan.khan at gatech.edu
Mon Jan 26 10:21:37 CST 2009


Are you doing substructuring yourself, or just using the PETSc parallel solver? 

I am not doing any substructuring; however, I am using the "-mat_partitioning_type parmetis" option to divide the geometry. I then assemble the global stiffness matrix and global load vector and use the PETSc parallel solver. 
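
(For reference, a rough sketch of how that option is typically picked up from code, assuming the parallel adjacency graph of the mesh is already stored in a Mat called Adj; the names are illustrative and exact call signatures vary slightly between PETSc releases.)

   MatPartitioning part;
   IS              newproc;   /* target process for each local vertex */

   ierr = MatPartitioningCreate(PETSC_COMM_WORLD, &part);CHKERRQ(ierr);
   ierr = MatPartitioningSetAdjacency(part, Adj);CHKERRQ(ierr);
   ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);  /* honors -mat_partitioning_type parmetis */
   ierr = MatPartitioningApply(part, &newproc);CHKERRQ(ierr); /* IS giving the new owner of each vertex */
   ierr = MatPartitioningDestroy(part);CHKERRQ(ierr);         /* takes &part in recent releases */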

Thanks 
Irfan 



--- On Mon, 1/26/09, Khan, Irfan <irfan.khan at gatech.edu> wrote: 


From: Khan, Irfan <irfan.khan at gatech.edu> 
Subject: Re: parallel solvers and matrix structure 
To: "PETSc users list" <petsc-users at mcs.anl.gov> 
Date: Monday, January 26, 2009, 12:04 AM 

Thanks for the quick and helpful suggestions. I have used MatSetOption() with 
MAT_IGNORE_ZERO_ENTRIES together with -pc_type sor. This gives me the same stiffness 
matrix in both cases (with and without excluding zero entries). Further, the 
results are the same, including the number of iterations. 

However, the number of iterations to convergence is 70. Earlier I used 
PCASM with KSPCG, which required only 17 iterations. Barry mentioned 
"spooles" with -pc_type cholesky, but I am skeptical about its performance 
on systems with over a million degrees of freedom. My matrices are positive 
definite, so CG may be ideally suited, but I am not sure what kind of 
preconditioner would suit it best in parallel. 
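
(For reference, the two setups being compared here are selected purely through runtime options; the executable name below is illustrative.)

   # CG with additive Schwarz: the earlier setup (17 iterations here)
   mpiexec -n 4 ./fem -ksp_type cg -pc_type asm

   # CG with SOR: the current setup (70 iterations here)
   mpiexec -n 4 ./fem -ksp_type cg -pc_type sor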

Any suggestions on preconditioner-solver combinations for very large (1-10 
million dof) positive definite systems? 

Thanks again 
Best Regards 
Irfan 


----- Original Message ----- 
From: "Barry Smith" <bsmith at mcs.anl.gov> 
To: "PETSc users list" <petsc-users at mcs.anl.gov> 
Sent: Sunday, January 25, 2009 9:42:42 PM GMT -05:00 US/Canada Eastern 
Subject: Re: parallel solvers and matrix structure 


On Jan 25, 2009, at 8:19 PM, Khan, Irfan wrote: 

> Dear Petsc team 
> 
> Firstly, thanks for developing PETSc. I have been using it to 
> parallelize a linear finite element code to use with a parallel 
> lattice Boltzmann code. It has helped me a lot until now. 
> 
> I have some questions about the way the parallel solvers handle 
> matrices with varying numbers of zero entries. 
> 
> I have found that the more MatSetValue() calls I make in my code, 
> the slower it gets. Therefore I initialize the parallel stiffness 
> matrix to 0.0. 

I don't know what you mean by this. You should NOT call 
MatZeroEntries() on the matrix initially. This will destroy your 
preallocation information. 
Call MatZeroEntries() after you have filled it up and used it, when 
you are ready to start again. 
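
(A minimal sketch of that lifecycle, using the MatCreateMPIAIJ() preallocation already described in this thread; nloc, d_nnz and o_nnz are illustrative names for the local size and per-row nonzero estimates.)

   Mat A;
   /* preallocate once, using the estimated per-row nonzero counts */
   ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, nloc, nloc, PETSC_DETERMINE, PETSC_DETERMINE,
                          0, d_nnz, 0, o_nnz, &A);CHKERRQ(ierr);

   /* first assembly: just MatSetValues(), no MatZeroEntries() beforehand */
   /* ... element loop calling MatSetValues() ... */
   ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
   ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
   /* ... use the matrix (KSPSolve, etc.) ... */

   /* only when refilling with the same nonzero pattern: */
   ierr = MatZeroEntries(A);CHKERRQ(ierr);
   /* ... MatSetValues() again, then assemble again ... */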

> I then fill in the values, using an "if" condition to avoid inserting 
> zero entries into the parallel stiffness matrix. This greatly reduces 
> the number of times MatSetValue() is called (by a factor of 5-15). I 
> also estimate the number of nonzero entries in the parallel matrix 
> that I create with MatCreateMPIAIJ. I have attached the output of 
> running my code with "-info"; one run with the "if" condition and the 
> other without it (at the end of the email). 

If you are using the SeqAIJ or MPIAIJ matrices, then you can simply 
call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) 
and it will not insert the locations whose value is zero. 
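
(For reference, a minimal sketch of what this looks like in the assembly code; A, Ke, rows, cols and nen are illustrative names. In PETSc 3.0 and later MatSetOption() takes a third PETSC_TRUE argument, while earlier releases use the two-argument form shown above.)

   /* drop zero-valued entries silently instead of storing them */
   ierr = MatSetOption(A, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);

   /* element loop: insert the full element matrix, zeros included,
      with no "if" filter in user code */
   ierr = MatSetValues(A, nen, rows, nen, cols, Ke, ADD_VALUES);CHKERRQ(ierr);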


> 
> 
> I have also compared the matrices generated with and without the "if" 
> condition. They are the same except for the extra zero entries in the 
> latter case. But what I realize is that during solving, the matrix 
> generated with the "if" condition does not converge, while the matrix 
> generated without the "if" condition converges in 17 iterations. I use 
> KSPCG. 

First run both cases with -pc_type lu and confirm that they both 
produce the same solution to the linear system (on one process). 

Also confirm that the matrix is symmetric positive definite so you 
can use CG. 

The reason for the difference in convergence is that the default 
solver PETSc uses is ILU(0). The "quality" of an ILU preconditioner 
depends on the "nonzero" structure of the matrix: the more "nonzero" 
locations, the more likely it is to converge, and the faster it will 
be. For ILU, a zero stored in a location is different from not having 
that location at all. Likely you want to use a better preconditioner; 
what about -pc_type sor? For moderate size problems, say < 100,000 
unknowns, you may want to use a Cholesky direct solver. In parallel, 
run config/configure.py --download-spooles, then run the code with 
-pc_type cholesky -pc_factor_mat_solver_package spooles 
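
(Concretely, the checks suggested above look something like the following; the executable name is illustrative and the spooles option names are as in PETSc 3.0.)

   # one-process sanity check: both assemblies should give the same solution
   ./fem -pc_type lu

   # build PETSc with Spooles support, then use a parallel Cholesky factorization
   ./config/configure.py --download-spooles
   mpiexec -n 4 ./fem -pc_type cholesky -pc_factor_mat_solver_package spooles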


Barry 

> 
> 
> It would help me a lot if I could get some suggestions on how to use 
> MatSetValue() optimally, on why KSPCG fails to converge, and on 
> whether something can be done about it. Also, any suggestions on 
> whether I should avoid using an "if" condition to eliminate zero 
> entries when entering values into the stiffness matrix would be welcome. 
> 
> I will be glad to send any other information if needed. 
> 
> Thanks in advance 
> Best Regards 
> Irfan 
> 
> 
> 
> 
> WITHOUT "IF" CONDITION 
> < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 
> < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 
> mallocs. 
> < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 
> 306 unneeded,4356 used 
> < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
> MatSetValues() is 216 
> < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 
> < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using 
> Inode routines 
> < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 
> 465 unneeded,3576 used 
> < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
> MatSetValues() is 183 
> < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 
> < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using 
> Inode routines 
> 
> < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 
> 243 unneeded,972 used 
> < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
> MatSetValues() is 72 
> < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 
> < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/ 
> (num_localrows 66) < 0.6. Do not use CompressedRow routines. 
> 
> < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 
> 504 unneeded,1116 used 
> < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
> MatSetValues() is 99 
> < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 
> < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/ 
> (num_localrows 66) < 0.6. Do not use CompressedRow routines. 
> 
> 
> WITH "IF" CONDITION 
>> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 
>> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. 
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 
>> 118 unneeded,1304 used 
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
>> MatSetValues() is 0 
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 
>> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. Not using 
>> Inode routines 
>> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 
>> 353 unneeded,1033 used 
>> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
>> MatSetValues() is 6 
>> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 
>> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using 
>> Inode routines 
> 
>> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: 
>> 14 unneeded,121 used 
>> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
>> MatSetValues() is 0 
>> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 
>> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/ 
>> (num_localrows 66) > 0.6. Use CompressedRow routines. 
> 
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: 
>> 14 unneeded,121 used 
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during 
>> MatSetValues() is 0 
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 
>> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/ 
>> (num_localrows 66) > 0.6. Use CompressedRow routines. 
> 
> 
> 
> 
> 

