[petsc-users] [petsc-dev] boomerAmg scalability

Mark F. Adams mark.adams at columbia.edu
Wed Jan 18 16:00:04 CST 2012


15-20 times more iterations is huge.

There are a few things to try.  GAMG can get confused in its eigenvalue estimate by the scaling of boundary-condition (BC) equations.

-ksp_diagonal_scale 

should fix this.  If the differences are still huge then I would have to take a look at the matrix.

The default solver type is a simpler, less optimal method.  I keep debating what to make the default, but here is another method to try:

-pc_gamg_type sa

This should not make a huge difference (2-3x at most).

Also, please run with -pc_gamg_verbose; the output is small and useful.
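
For reference, a combined run with all of these options might look something like the following (the executable name is just a placeholder for your application):

    mpiexec -n 4 ./your_app -pc_type gamg -pc_gamg_type sa \
        -ksp_diagonal_scale -pc_gamg_verbose \
        -ksp_monitor -ksp_converged_reason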

Mark

On Jan 18, 2012, at 12:03 PM, Ravi Kannan wrote:

> Hi Mark, Hong,
>  
> As you might remember, the reason for this whole exercise was to obtain a solution for a very stiff problem.
>  
> We did try hypre's BoomerAMG. It did not scale, but it gives the correct solution. So we wanted an alternative; hence we approached you about gamg.
>  
> However, for certain cases, gamg crashes. Even for the working cases, it takes about 15-20 times more sweeps than hypre's BoomerAMG, so it is cost-prohibitive.
>  
> Hopefully this gamg solver can be improved in the near future, for users like us.
>  
> Warm Regards,
> Ravi.
>  
>  
> From: Mark F. Adams [mailto:mark.adams at columbia.edu] 
> Sent: Wednesday, January 18, 2012 9:56 AM
> To: Hong Zhang
> Cc: rxk at cfdrc.com
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
> Hong and Ravi,
>  
> I fixed a bug with the 6x6 problem.  There seemed to be a bug in MatTransposeMatMult with a funny decomposition, but that was never really verified.  So we can wait for Ravi to continue with his tests and fix problems as they arise.
>  
> Mark
> ps, Ravi, I may not have cc'ed so I will send again.
>  
> On Jan 17, 2012, at 7:37 PM, Hong Zhang wrote:
> 
> 
> Ravi,
> I wrote a simple test, ex163.c (attached), for MatTransposeMatMult().
> Loading your 6x6 matrix gives no error from MatTransposeMatMult()
> using 1, 2, ..., 7 processes.
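>  
> A minimal sketch of what such a test does (not the actual attachment; error checking is omitted, and the hard-coded file name is just the default written by -ksp_view_binary):
>  
> #include <petscmat.h>
>  
> int main(int argc, char **argv)
> {
>   Mat         A, C;
>   PetscViewer fd;
>  
>   PetscInitialize(&argc, &argv, NULL, NULL);
>  
>   /* load the matrix written by -ksp_view_binary */
>   PetscViewerBinaryOpen(PETSC_COMM_WORLD, "binaryoutput", FILE_MODE_READ, &fd);
>   MatCreate(PETSC_COMM_WORLD, &A);
>   MatLoad(A, fd);
>   PetscViewerDestroy(&fd);
>  
>   /* form C = A^T * A and print both matrices */
>   MatTransposeMatMult(A, A, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);
>   MatView(A, PETSC_VIEWER_STDOUT_WORLD);
>   MatView(C, PETSC_VIEWER_STDOUT_WORLD);
>  
>   MatDestroy(&A);
>   MatDestroy(&C);
>   PetscFinalize();
>   return 0;
> }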
> For example,
>  
> petsc-dev/src/mat/examples/tests>mpiexec -n 4 ./ex163 -f /Users/hong/Downloads/repetscdevboomeramgscalability/binaryoutput
> A:
> Matrix Object: 1 MPI processes
>   type: mpiaij
> row 0: (0, 1.66668e+06)  (1, -1.35)  (3, -0.6) 
> row 1: (0, -1.35)  (1, 1.66667e+06)  (2, -1.35)  (4, -0.6) 
> row 2: (1, -1.35)  (2, 1.66667e+06)  (5, -0.6) 
> row 3: (0, -0.6)  (3, 1.66668e+06)  (4, -1.35) 
> row 4: (1, -0.6)  (3, -1.35)  (4, 1.66667e+06)  (5, -1.35) 
> row 5: (2, -0.6)  (4, -1.35)  (5, 1.66667e+06) 
>  
> C = A^T * A:
> Matrix Object: 1 MPI processes
>   type: mpiaij
> row 0: (0, 2.77781e+12)  (1, -4.50002e+06)  (2, 1.8225)  (3, -2.00001e+06)  (4, 1.62) 
> row 1: (0, -4.50002e+06)  (1, 2.77779e+12)  (2, -4.50001e+06)  (3, 1.62)  (4, -2.00001e+06)  (5, 1.62) 
> row 2: (0, 1.8225)  (1, -4.50001e+06)  (2, 2.7778e+12)  (4, 1.62)  (5, -2.00001e+06) 
> row 3: (0, -2.00001e+06)  (1, 1.62)  (3, 2.77781e+12)  (4, -4.50002e+06)  (5, 1.8225) 
> row 4: (0, 1.62)  (1, -2.00001e+06)  (2, 1.62)  (3, -4.50002e+06)  (4, 2.77779e+12)  (5, -4.50001e+06) 
> row 5: (1, 1.62)  (2, -2.00001e+06)  (3, 1.8225)  (4, -4.50001e+06)  (5, 2.7778e+12)
>  
> Am I missing something?
>  
> Hong
>  
> On Sat, Jan 14, 2012 at 3:37 PM, Mark F. Adams <mark.adams at columbia.edu> wrote:
> Ravi, this system is highly diagonally dominant.  I've fixed the code, so you can pull and try again.
>  
> I've decided to basically just do a one-level method for diagonally dominant systems.  I don't know if that is the best semantics; I think Barry will hate it, because it gives you a one-level solver when you asked for MG.  It now picks up the coarse-grid solver as the solver, which is wrong, so I need to fix that if we decide to stick with the current semantics.
>  
> And again thanks for helping to pound on this code.
>  
> Mark
>  
> On Jan 13, 2012, at 6:33 PM, Ravi Kannan wrote:
>  
> Hi Mark, Hong,
>  
> Let's make it simpler. I fixed my partitioning bug (in METIS). Now there is an equal division of cells.
>  
> To simplify even further, let's run a much smaller case: 6 cells (equations) in SERIAL. This one crashes. The output and the ksp_view_binary files are attached.
>  
> Thanks,
> Ravi.
>  
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams
> Sent: Friday, January 13, 2012 3:00 PM
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
> Well, we do have a bug here.  It should work with zero elements on a proc, but the code is being actively developed so you are really helping us to find these cracks.
>  
> If it's not too hard, it would be nice if you could give us these matrices before you fix it, so we can fix this bug.  You can just send them to Hong and me (cc'ed).
>  
> Mark
>  
> On Jan 13, 2012, at 12:16 PM, Ravi Kannan wrote:
>  
> 
> Hi Mark, Hong,
>  
> Thanks for the observation regarding proc 0 having 2 equations. This is a bug on our end. We will fix it and get back to you if needed.
>  
> Thanks,
> Ravi.
>  
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams
> Sent: Thursday, January 12, 2012 10:03 PM
> To: Hong Zhang
> Cc: For users of the development version of PETSc
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
> Ravi, can you run with -ksp_view_binary? This will produce two files.
>  
> Hong, ex10 will read in these files and solve them.  I will probably not be able to get to this until Monday.
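>  
> For instance (the exact ex10 options here are an assumption on my part), the run would look something along the lines of:
>  
>     mpiexec -n 2 ./ex10 -f0 binaryoutput -ksp_type gmres -pc_type gamg -ksp_monitor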
>  
> Also, this matrix has just two equations on proc 0 and about 11000 on proc 1, so it is strangely balanced, in case that helps ...
>  
> Mark
>  
> On Jan 12, 2012, at 10:35 PM, Hong Zhang wrote:
> 
> 
> 
> Ravi,
>  
> I need more info for debugging. Can you provide a simple stand-alone code and matrices in PETSc
> binary format that reproduce the error?
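>  
> If it helps, one way to dump a matrix in PETSc binary format directly from your own code is roughly the following sketch (the file name is just a placeholder):
>  
>     PetscViewer fd;
>     PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.bin", FILE_MODE_WRITE, &fd);
>     MatView(A, fd);   /* writes A in PETSc binary format */
>     PetscViewerDestroy(&fd);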
>  
> MatTransposeMatMult() for mpiaij is a newly developed subroutine - less than one month old 
> and not well tested yet :-(
> I used petsc-dev/src/mat/examples/tests/ex94.c for testing.
>  
> Thanks,
>  
> Hong
> 
> On Thu, Jan 12, 2012 at 9:17 PM, Mark F. Adams <mark.adams at columbia.edu> wrote:
> It looks like the problem is in MatTransposeMatMult and Hong (cc'ed) is working on it.
>  
> I'm hoping that your output will be enough for Hong to figure this out but I could not reproduce this problem with any of my tests.
>  
> If Hong can not figure this out then we will need to get the matrix from you to reproduce this.
>  
> Mark
>  
>  
> On Jan 12, 2012, at 6:25 PM, Ravi Kannan wrote:
> 
> 
> 
> Hi Mark,
>  
> Any luck with the gamg bug fix?
>  
> Thanks,
> Ravi.
>  
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams
> Sent: Wednesday, January 11, 2012 1:54 PM
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
> This seems to be dying earlier than it was last week, so it looks like a new bug in MatTransposeMatMult.
>  
> Mark
>  
> On Jan 11, 2012, at 1:59 PM, Matthew Knepley wrote:
>  
> 
> On Wed, Jan 11, 2012 at 12:23 PM, Ravi Kannan <rxk at cfdrc.com> wrote:
> Hi Mark,
>  
> I downloaded the dev version again. This time, the program crashes even earlier. Attached are the serial and parallel info outputs.
>  
> Could you kindly take a look.
>  
> It looks like this is a problem with MatMatMult(). Can you try to reproduce this using KSP ex10? Put
> your matrix in binary format and use -pc_type gamg. Then you can send us the matrix and we can track
> it down. Or are you running an example there?
>  
>   Thanks,
>  
>     Matt
>  
>  
>  
> Thanks,
> Ravi.
>  
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams
> Sent: Monday, January 09, 2012 3:08 PM
> 
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
>  
> Yes, it's all checked in; just pull from dev.
> Mark
>  
> On Jan 9, 2012, at 2:54 PM, Ravi Kannan wrote:
>  
> 
> Hi Mark,
>  
> Thanks for your efforts.
>  
> Do I need to do the install from scratch once again? Or just pull some particular files (check out gamg.c, for instance)?
>  
> Thanks,
> Ravi.
>  
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Mark F. Adams
> Sent: Friday, January 06, 2012 10:30 AM
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
> I think I found the problem.  You will need to use petsc-dev to get the fix.
>  
> Mark
>  
> On Jan 6, 2012, at 8:55 AM, Mark F. Adams wrote:
>  
> 
> Ravi, I forgot, but you can just use -ksp_view_binary to output the matrix data (two files).  You could run it with two procs and a Jacobi solver to get it past the solve, where it writes the matrix (I believe).
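>  
> For example, something like the following (the executable name is a placeholder for your application; the files written are binaryoutput and binaryoutput.info):
>  
>     mpiexec -n 2 ./your_app -pc_type jacobi -ksp_view_binary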
> Mark
>  
> On Jan 5, 2012, at 6:19 PM, Ravi Kannan wrote:
>  
> 
> Just sent another email with the attachment.
>  
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Jed Brown
> Sent: Thursday, January 05, 2012 5:15 PM
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] boomerAmg scalability
>  
> On Thu, Jan 5, 2012 at 17:12, Ravi Kannan <rxk at cfdrc.com> wrote:
> I have attached the verbose+info outputs for both the serial and the parallel (2 partitions) runs. NOTE: the serial output at some location says PC=Jacobi! Is it implicitly converting the PC to Jacobi?
>  
> Looks like you forgot the attachment.
>  
>  
>  
> 
> 
>  
> -- 
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>  
>  
>  
>  
>  
> <out><binaryoutput><binaryoutput.info>
>  
>  
> <ex163.c>
>  
