[petsc-users] Which preconditioners are scalable?

Barry Smith bsmith at mcs.anl.gov
Fri Mar 11 13:00:49 CST 2011


  I'm having trouble understanding your email.

   What is N? Is that the number of processes? 

   What does the notation 5'912'016 mean?

  Are the numbers in your table from a particular process? Or are they summed over all processes?


   The intention is that ASM is memory scalable, so that if, for example, you double the number of processes and double the total number of nonzeros in the matrix (probably by doubling the total number of rows and columns in the matrix), each process should require essentially the same amount of memory.  What happens in practice for a particular problem will, to some degree, depend on the amount of coupling between processes in the matrix (hence on how much bigger the local overlapped matrix is than the original matrix on that process) and on how the domain is sliced up.  But even with a "bad" slicing I would not expect the amount of local memory needed to double.  I think you need to determine more completely what all this memory is being used for (see the sketch after the quoted text below).  If you find a scalability problem we definitely want to fix it, because we intend it to work as you say:

> I thought that
> 'only' the local matrices, plus some constant overlap with neighbors,
> are solved, so that memory consumption should stay constant when I scale
> up with a constant number of rows per process.
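
   For reference, here is a minimal sketch of one way to check the per-process memory around an ASM setup and solve. It is only an illustration, not code from this thread: it assumes a KSP object ksp that already has the system matrix attached, it uses the ierr/CHKERRQ error-checking idiom of the PETSc C API, and the helper name SolveAndReportMemory is made up.

  #include <petscksp.h>

  /* Configure ASM with one layer of overlap, solve, and report per-process
     memory so that runs at different process counts can be compared. */
  PetscErrorCode SolveAndReportMemory(KSP ksp, Vec b, Vec x)
  {
    PC             pc;
    PetscLogDouble rss, mal;
    PetscMPIInt    rank;
    PetscErrorCode ierr;

    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCASM);CHKERRQ(ierr);
    ierr = PCASMSetOverlap(pc, 1);CHKERRQ(ierr);  /* one layer of overlap with neighbors */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

    /* resident set size and PETSc-tracked allocations on this process */
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);
    ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF, "[%d] RSS %g bytes, malloc'd %g bytes\n",
                       (int)rank, rss, mal);CHKERRQ(ierr);
    return 0;
  }

   Comparing these per-process numbers between the two runs should show whether the growth is confined to the overlap routines or spread across the whole solve.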


  Barry




On Mar 11, 2011, at 9:52 AM, Sebastian Steiger wrote:

> Hello PETSc developers
> 
> I'm doing some scaling benchmarks and I found that the parallel asm
> preconditioner, my favorite preconditioner, has a limit on the number of
> cores it can handle.
> 
> I am doing a numerical experiment where I scale up the size of my matrix
> by roughly the same factor as the number of CPUs employed. When I look
> at how much memory each function used, via PETSc's routine
> PetscMallocDumpLog (see the sketch after this message), I see the following:
> 
> Function name                        N=300         N=600     increase
> ======================================================================
> MatGetSubMatrices_MPIAIJ_Local    75'912'016   134'516'928    1.77
> MatIncreaseOverlap_MPIAIJ_Once    168'288'288  346'870'832    2.06
> MatIncreaseOverlap_MPIAIJ_Receive  2'918'960     5'658'160    1.94
> 
> The matrix sizes are 6'899'904 and 14'224'896, respectively. Above
> N~5000 CPUs I am running out of memory.
> 
> Here's my question: is the asm preconditioner limited from an
> algorithmic point of view, or by the implementation? I thought that
> 'only' the local matrices, plus some constant overlap with neighbors,
> are solved, so that memory consumption should stay constant when I scale
> up with a constant number of rows per process.
> 
> Best
> Sebastian
> 
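
For completeness, a sketch of the logging workflow described in the quoted message: collecting PETSc's per-function malloc log and dumping it with PetscMallocDumpLog. The option and function names below follow the PETSc releases of that period (later releases renamed this machinery), the log requires PETSc's tracing malloc to be active (the default in debug builds), and the assembly and solve are elided.

  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
    ierr = PetscMallocSetDumpLog();CHKERRQ(ierr);   /* start recording mallocs by function;
                                                       the -malloc_log option does the same */

    /* ... assemble the matrix, build the ASM-preconditioned KSP, solve ... */

    ierr = PetscMallocDumpLog(stdout);CHKERRQ(ierr); /* per-function memory totals,
                                                        as in the table above */
    ierr = PetscFinalize();
    return 0;
  }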


