[petsc-dev] many subdomains per process

Barry Smith bsmith at mcs.anl.gov
Sat Feb 6 19:39:30 CST 2010

    You could try running with -malloc_log to see where all the memory  
is being malloced by PETSc.


On Feb 6, 2010, at 4:21 PM, Jed Brown wrote:

> Sometimes I like to "simulate" having a big machine in an interactive
> environment, mostly to investigate algorithmic scalability for high
> process counts.  You can oversubscribe a small machine up to a point,
> but kernels don't work so well when they have thousands of processes
> trying to do memory-bound operations.
> So I'll take, e.g. an 8-core box with 8 GiB of memory, and do things
> like the following
>  mpiexec -n 8 ./ex48 -M 4 -P 3 -thi_nlevels 5 -thi_hom z -thi_L 5e3 \
>    -ksp_monitor -ksp_converged_reason -snes_monitor -log_summary \
>    -mg_levels_pc_type asm -mg_levels_pc_asm_blocks 8192 -thi_mat_type baij
>  Level 0 domain size (m)    5e+03 x    5e+03 x    1e+03, num elements   4x  4x  3 (      48), size (m) 1250 x 1250 x 500
>  Level 1 domain size (m)    5e+03 x    5e+03 x    1e+03, num elements   8x  8x  5 (     320), size (m) 625 x 625 x 250
>  Level 2 domain size (m)    5e+03 x    5e+03 x    1e+03, num elements  16x 16x  9 (    2304), size (m) 312.5 x 312.5 x 125
>  Level 3 domain size (m)    5e+03 x    5e+03 x    1e+03, num elements  32x 32x 17 (   17408), size (m) 156.25 x 156.25 x 62.5
>  Level 4 domain size (m)    5e+03 x    5e+03 x    1e+03, num elements  64x 64x 33 (  135168), size (m) 78.125 x 78.125 x 31.25
> These are absurdly small subdomains, with only 33 dofs per subdomain on
> the fine level, and 0.078 dofs per subdomain on level 1, but I'm still
> unhappy to see thousands of lines of this crap coming from Parmetis:
>        ***Cannot bisect a graph with 0 vertices!
>        ***You are trying to partition a graph into too many parts!
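
The per-subdomain figures quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes 2 dofs per node and that the node count on each level equals the "num elements" product printed by ex48. Both assumptions are inferred from the quoted 33 and 0.078 figures, not taken from the ex48 source.

```python
# Assumptions (inferred from the figures quoted above, not from ex48 itself):
#   - 2 dofs per node
#   - nodes per level == the "num elements" product in the output
DOFS_PER_NODE = 2
BLOCKS = 8192  # value passed to -mg_levels_pc_asm_blocks

# Grid dimensions per level, copied from the output above.
levels = {
    0: (4, 4, 3),
    1: (8, 8, 5),
    2: (16, 16, 9),
    3: (32, 32, 17),
    4: (64, 64, 33),
}


def dofs_per_subdomain(dims, blocks=BLOCKS, dofs_per_node=DOFS_PER_NODE):
    """Average number of dofs that land in each ASM block on one level."""
    nx, ny, nz = dims
    return nx * ny * nz * dofs_per_node / blocks


ratios = {lvl: dofs_per_subdomain(dims) for lvl, dims in levels.items()}

for lvl, ratio in ratios.items():
    note = "  (some blocks must be empty)" if ratio < 1 else ""
    print(f"level {lvl}: {ratio:.3f} dofs per subdomain{note}")
```

Under these assumptions, every level with a ratio below 1 cannot avoid empty blocks, which is at least consistent with the "0 vertices" complaint.
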
> I even get some of this with the slightly less contrived
>  mpiexec -n 8 ./ex48 -M 16 -P 9 -thi_nlevels 3 -thi_hom z -thi_L 5e3 \
>    -ksp_monitor -ksp_converged_reason -snes_monitor -log_summary \
>    -mg_levels_pc_type asm -mg_levels_pc_asm_blocks 8192 -thi_mat_type baij \
>    -dmmg_view -snes_max_it 3 -thi_verbose -mg_coarse_pc_type hypre
> Here, level 1 has 4.25 dofs (2.125 nodes) in each subdomain, so I
> wouldn't expect empty subdomains.  Apparently Parmetis is still
> producing a usable partition because the solver works, but I'm curious
> how to prevent this outburst.  It's apparently not as simple as just
> checking that there are as many nodes as requested subdomains.  Is this
> something worth working around on the PETSc side, or should I ask
> upstream?
> Finally, I have a concern over memory usage.  When I run with these huge
> subdomain counts, I see a huge memory spike at setup.  This is independent
> of Parmetis:
>  mpiexec -n 8 ./ex48 -M 24 -P 7 -thi_nlevels 3 -thi_hom z -thi_L 10e3 \
>    -ksp_monitor -ksp_converged_reason -snes_monitor -log_summary \
>    -mg_levels_pc_type asm -mg_levels_pc_asm_blocks 8192 -thi_mat_type baij \
>    -dmmg_view -thi_verbose -mg_coarse_pc_type hypre \
>    -thi_mat_partitioning_type square -mat_partitioning_type square
> Each process briefly goes up to about 1000 MB resident, then drops to
> about 100 MB, and finally climbs slowly to stabilize at 480 MB (once
> matrices are assembled and factored).  I haven't tracked down the
> source, but there is clearly a huge allocation, all the pages are
> faulted, and then released very soon afterwards.  That memory doesn't
> seem to be attributable to any objects because the usage below only adds
> up to approximately the stable resident size (nowhere near the huge
> spike).
>  Memory usage is given in bytes:
>  Object Type          Creations   Destructions   Memory   Descendants' Mem.
>  Reports information only for process 0.
>  --- Event Stage 0: Main Stage
>   Toy Hydrostatic Ice     1              1  588.000000     0
>     Distributed array     6              6  352752.000000     0
>                   Vec  6250           6250  41393472.000000     0
>           Vec Scatter  6163           6163  3645548.000000     0
>             Index Set 24614          24614  21788136.000000     0
>     IS L to G Mapping  2060           2060  143594928.000000     0
>                Matrix  6165           6165  360639852.000000     0
>   Matrix Partitioning     2              2  896.000000     0
>                  SNES     3              3  3096.000000     0
>         Krylov Solver  2057           2057  1781272.000000     0
>        Preconditioner  2057           2057  1448592.000000     0
>                Viewer     3              3  1632.000000     0
> Jed
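
For reference, the Memory column of the table above can be totalled directly. This sketch only adds up the per-type figures as printed for process 0. It makes no claim about how -log_summary accounts for memory, and the MiB conversion is added here for convenience.

```python
# Memory column (bytes) copied from the -log_summary table above, process 0 only.
memory_bytes = {
    "Toy Hydrostatic Ice": 588,
    "Distributed array": 352752,
    "Vec": 41393472,
    "Vec Scatter": 3645548,
    "Index Set": 21788136,
    "IS L to G Mapping": 143594928,
    "Matrix": 360639852,
    "Matrix Partitioning": 896,
    "SNES": 3096,
    "Krylov Solver": 1781272,
    "Preconditioner": 1448592,
    "Viewer": 1632,
}

total = sum(memory_bytes.values())
total_mib = total / 2**20

# List the object types from largest to smallest share of the total.
for name, nbytes in sorted(memory_bytes.items(), key=lambda kv: -kv[1]):
    print(f"{name:>20s} {nbytes:>12d} B  {100 * nbytes / total:5.1f}%")
print(f"{'total':>20s} {total:>12d} B  ({total_mib:.0f} MiB)")
```

The total is on the order of the stable resident size reported in the message and well short of the roughly 1000 MB spike, with Matrix and IS L to G Mapping accounting for most of it.
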
