[petsc-users] [petsc-dev] boomerAmg scalability

Ravi Kannan rxk at cfdrc.com
Wed Jan 18 14:03:43 CST 2012


Hi Mark, Hong,

 

As you might remember, the reason for this whole exercise was to obtain a
solution for a very stiff problem.

 

We did try hypre BoomerAMG. It did not scale, but it gives the correct
solution. So we wanted an alternative; hence we approached you about gamg.

 

However, for certain cases gamg crashes. Even for the working cases, it
takes about 15-20 times more sweeps than hypre BoomerAMG, so it is
cost-prohibitive.

 

We hope the gamg solver can be improved in the near future for users
like us. 

 

Warm Regards,

Ravi. 

 

 

From: Mark F. Adams [mailto:mark.adams at columbia.edu] 
Sent: Wednesday, January 18, 2012 9:56 AM
To: Hong Zhang
Cc: rxk at cfdrc.com
Subject: Re: [petsc-dev] boomerAmg scalability

 

Hong and Ravi,

 

I fixed a bug with the 6x6 problem.  There seemed to be a bug in
MatTransposeMatMult with a funny decomposition, but that was not really verified.  So
we can wait for Ravi to continue with his tests and fix issues as they arise.

 

Mark

PS: Ravi, I may not have cc'ed you, so I will send this again.

 

On Jan 17, 2012, at 7:37 PM, Hong Zhang wrote:





Ravi,

I wrote a simple test ex163.c (attached) on MatTransposeMatMult().

Loading your 6x6 matrix gives no error from MatTransposeMatMult()

using 1,2,...7 processes.
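
For reference, the core of the test is small; here is a minimal sketch of what
it does (not the attached ex163.c itself, and written against a recent
petsc-dev, so option-handling signatures may differ slightly between versions):

static char help[] = "Loads a matrix A from a PETSc binary file and forms C = A^T * A.\n";

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, C;
  PetscViewer    fd;
  char           file[PETSC_MAX_PATH_LEN];
  PetscBool      flg;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr;
  ierr = PetscOptionsGetString(NULL, NULL, "-f", file, sizeof(file), &flg);CHKERRQ(ierr);

  /* load A from the binary file given with -f */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, file, FILE_MODE_READ, &fd);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);
  ierr = MatLoad(A, fd);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);

  ierr = PetscPrintf(PETSC_COMM_WORLD, "A:\n");CHKERRQ(ierr);
  ierr = MatView(A, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  /* C = A^T * A via MatTransposeMatMult() */
  ierr = MatTransposeMatMult(A, A, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "C = A^T * A:\n");CHKERRQ(ierr);
  ierr = MatView(C, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = MatDestroy(&C);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}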

For example,

 

petsc-dev/src/mat/examples/tests>mpiexec -n 4 ./ex163 -f
/Users/hong/Downloads/repetscdevboomeramgscalability/binaryoutput

A:

Matrix Object: 1 MPI processes

  type: mpiaij

row 0: (0, 1.66668e+06)  (1, -1.35)  (3, -0.6) 

row 1: (0, -1.35)  (1, 1.66667e+06)  (2, -1.35)  (4, -0.6) 

row 2: (1, -1.35)  (2, 1.66667e+06)  (5, -0.6) 

row 3: (0, -0.6)  (3, 1.66668e+06)  (4, -1.35) 

row 4: (1, -0.6)  (3, -1.35)  (4, 1.66667e+06)  (5, -1.35) 

row 5: (2, -0.6)  (4, -1.35)  (5, 1.66667e+06) 

 

C = A^T * A:

Matrix Object: 1 MPI processes

  type: mpiaij

row 0: (0, 2.77781e+12)  (1, -4.50002e+06)  (2, 1.8225)  (3, -2.00001e+06)
(4, 1.62) 

row 1: (0, -4.50002e+06)  (1, 2.77779e+12)  (2, -4.50001e+06)  (3, 1.62)
(4, -2.00001e+06)  (5, 1.62) 

row 2: (0, 1.8225)  (1, -4.50001e+06)  (2, 2.7778e+12)  (4, 1.62)  (5,
-2.00001e+06) 

row 3: (0, -2.00001e+06)  (1, 1.62)  (3, 2.77781e+12)  (4, -4.50002e+06)
(5, 1.8225) 

row 4: (0, 1.62)  (1, -2.00001e+06)  (2, 1.62)  (3, -4.50002e+06)  (4,
2.77779e+12)  (5, -4.50001e+06) 

row 5: (1, 1.62)  (2, -2.00001e+06)  (3, 1.8225)  (4, -4.50001e+06)  (5,
2.7778e+12)

 

Am I missing something?

 

Hong

 

On Sat, Jan 14, 2012 at 3:37 PM, Mark F. Adams <mark.adams at columbia.edu>
wrote:

Ravi, this system is highly diagonally dominant.  I've fixed the code so you
can pull and try again.
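
(For reference, by diagonally dominant I mean rows satisfying
$|a_{ii}| \ge \sum_{j \ne i} |a_{ij}|$; in the 6x6 matrix from this thread the
diagonal entries are about 1.7e+06 against off-diagonal entries of order 1, so
the dominance is by several orders of magnitude.)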

 

I've decided to basically just do a one-level method for diagonally dominant (DD)
systems.  I don't know if that is the best semantics; I think Barry will hate it,
because it gives you a one-level solver when you asked for MG.  It now picks
up the coarse-grid solver as the solver, which is wrong, so I need to fix
that if we decide to stick with the current semantics.

 

And again thanks for helping to pound on this code.

 

Mark

 

On Jan 13, 2012, at 6:33 PM, Ravi Kannan wrote:

 

Hi Mark, Hong,

 

Let's make it simpler. I fixed my partitioning bug (in METIS). Now there is an
equal division of cells.

 

To simplify even further, let's run a much smaller case: 6 cells
(equations) in SERIAL. This one crashes. The out and the ksp_view_binary
files are attached.

 

Thanks,

Ravi.

 

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov]
On Behalf Of Mark F. Adams
Sent: Friday, January 13, 2012 3:00 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] boomerAmg scalability

 

Well, we do have a bug here.  It should work with zero elements on a proc,
but the code is being actively developed so you are really helping us to
find these cracks.

 

If it's not too hard, it would be nice if you could give us these matrices
before you fix it, so we can fix this bug.  You can just send them to Hong and
me (cc'ed).

 

Mark

 

On Jan 13, 2012, at 12:16 PM, Ravi Kannan wrote:

 

Hi Mark, Hong,

 

Thanks for the observation w.r.t. proc 0 having 2 equations. This is a
bug on our end. We will fix it and get back to you if needed.

 

Thanks,

Ravi.

 

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov]
On Behalf Of Mark F. Adams
Sent: Thursday, January 12, 2012 10:03 PM
To: Hong Zhang
Cc: For users of the development version of PETSc
Subject: Re: [petsc-dev] boomerAmg scalability

 

Ravi, can you run with -ksp_view_binary? This will produce two files.

 

Hong, ex10 will read in these files and solve them.  I will probably not be
able to get to this until Monday.

 

Also, this matrix has just two equations on proc 0 and about 11000 on
proc 1, so it is strangely balanced, in case that helps ...

 

Mark

 

On Jan 12, 2012, at 10:35 PM, Hong Zhang wrote:





Ravi,

 

I need more info for debugging. Can you provide a simple stand-alone code
and matrices in PETSc binary format that reproduce the error?
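
In case it helps, a minimal sketch of how to dump a matrix A directly from your
code into PETSc binary format (the file name "binaryoutput" is just an example):

  PetscViewer    viewer;
  PetscErrorCode ierr;

  /* open a binary viewer for writing and dump A; a Vec (e.g. the RHS)
     can be written to the same file afterwards with VecView() */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "binaryoutput", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);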

 

MatTransposeMatMult() for mpiaij is a newly developed subroutine - less than
one month old and not well tested yet :-(

I used petsc-dev/src/mat/examples/tests/ex94.c for testing.

 

Thanks,

 

Hong

On Thu, Jan 12, 2012 at 9:17 PM, Mark F. Adams <mark.adams at columbia.edu>
wrote:

It looks like the problem is in MatTransposeMatMult and Hong (cc'ed) is
working on it.

 

I'm hoping that your output will be enough for Hong to figure this out but I
could not reproduce this problem with any of my tests.

 

If Hong cannot figure this out, then we will need to get the matrix from you
to reproduce it.

 

Mark

 

 

On Jan 12, 2012, at 6:25 PM, Ravi Kannan wrote:





Hi Mark,

 

Any luck with the gamg bug fix?

 

Thanks,

Ravi.

 

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov]
On Behalf Of Mark F. Adams
Sent: Wednesday, January 11, 2012 1:54 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] boomerAmg scalability

 

This seems to be dying earlier than it was last week, so it looks like a new
bug in MatTransposeMatMult.

 

Mark

 

On Jan 11, 2012, at 1:59 PM, Matthew Knepley wrote:

 

On Wed, Jan 11, 2012 at 12:23 PM, Ravi Kannan <rxk at cfdrc.com> wrote:

Hi Mark,

 

I downloaded the dev version again. This time, the program crashes even
earlier. Attached are the serial and parallel info outputs.

 

Could you kindly take a look.

 

It looks like this is a problem with MatMatMult(). Can you try to reproduce
this using KSP ex10? You put your matrix in binary format and use -pc_type
gamg. Then you can send us the matrix and we can track it down. Or are you
running an example there?
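
Something along these lines should do it (the file name is whatever
-ksp_view_binary wrote, typically binaryoutput, and the paths are just
illustrative):

  cd petsc-dev/src/ksp/ksp/examples/tutorials
  make ex10
  mpiexec -n 2 ./ex10 -f0 binaryoutput -pc_type gamg -ksp_monitor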

 

  Thanks,

 

    Matt

 

 

 

Thanks,

Ravi.

 

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov]
On Behalf Of Mark F. Adams
Sent: Monday, January 09, 2012 3:08 PM


To: For users of the development version of PETSc
Subject: Re: [petsc-dev] boomerAmg scalability

 

 

Yes, it's all checked in; just pull from dev.

Mark

 

On Jan 9, 2012, at 2:54 PM, Ravi Kannan wrote:

 

Hi Mark,

 

Thanks for your efforts.

 

Do I need to do the install from scratch once again? Or just update particular
files (check out gamg.c, for instance)?

 

Thanks,

Ravi.

 

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov]
On Behalf Of Mark F. Adams
Sent: Friday, January 06, 2012 10:30 AM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] boomerAmg scalability

 

I think I found the problem.  You will need to use petsc-dev to get the fix.

 

Mark

 

On Jan 6, 2012, at 8:55 AM, Mark F. Adams wrote:

 

Ravi, I forgot to mention, but you can just use -ksp_view_binary to output the matrix
data (two files).  You could run it with two procs and a Jacobi solver to
get past the solve, which is where it writes the matrix (I believe).
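
For example, something like this (your_app and its arguments are just
placeholders for your actual run command):

  mpiexec -n 2 ./your_app <usual arguments> -pc_type jacobi -ksp_view_binary

which should leave binaryoutput and binaryoutput.info in the working
directory.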

Mark

 

On Jan 5, 2012, at 6:19 PM, Ravi Kannan wrote:

 

Just sent another email with the attachment.

 

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov]
On Behalf Of Jed Brown
Sent: Thursday, January 05, 2012 5:15 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] boomerAmg scalability

 

On Thu, Jan 5, 2012 at 17:12, Ravi Kannan <rxk at cfdrc.com> wrote:

I have attached the verbose+info outputs for both the serial and the
parallel (2 partitions) runs. NOTE: at one point the serial output says
PC=Jacobi! Is it implicitly converting the PC to Jacobi?

 

Looks like you forgot the attachment.

 

 

 





 

-- 
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener

 

 

 

 

 

<out> <binaryoutput> <binaryoutput.info>

 

 

<ex163.c>

 
