[petsc-users] superlu_dist and MatSolveTranspose

Antoine De Blois antoine.deblois at aero.bombardier.com
Tue Sep 23 11:51:51 CDT 2014


Morning Hong,

Alright, fully understood. Please keep me posted on that matter.
Regards,
Antoine

-----Original Message-----
From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: Tuesday, September 23, 2014 11:49 AM
To: Hong
Cc: Antoine De Blois; Gaetan Kenway; petsc-users at mcs.anl.gov; Sherry Li
Subject: Re: [petsc-users] superlu_dist and MatSolveTranspose

Antoine,
I just found out that superlu_dist does not support MatSolveTranspose yet (see Sherry's email below).
Once superlu_dist provides this support, we can add it to the petsc/superlu_dist interface.

Thanks for your patience.

Hong
-------------------------------------------

Hong,
Sorry, the transposed solve is not there yet; it's not as simple as the serial version, because here it requires setting up an entirely different communication pattern.

I will try to find time to do it.

Sherry

On Tue, Sep 23, 2014 at 8:11 AM, Hong <hzhang at mcs.anl.gov> wrote:
>
> Sherry,
> Can superlu_dist be used for solving A^T x = b?
>
> Using the option
> options.Trans = TRANS;
> with the existing petsc-superlu_dist interface, I cannot get the correct solution.
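>
> For reference, the relevant fragment on the superlu_dist side looks roughly like this (a sketch only; types and driver names as I understand them from the superlu_dist headers):
>
>   superlu_options_t options;
>   set_default_options_dist(&options);
>   options.Trans = TRANS;   /* ask the driver to solve A^T x = b */
>   /* ... then factor and solve through pdgssvx() as the interface already does ... */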
>
> Hong


On Mon, Sep 22, 2014 at 12:47 PM, Hong <hzhang at mcs.anl.gov> wrote:
> I'll add it. It would not take too long, just a matter of priority.
> I'll try to get it done in a day or two, then let you know when it works.
>
> Hong
>
> On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois 
> <antoine.deblois at aero.bombardier.com> wrote:
>> Dear all,
>>
>> Sorry for the delay on this topic.
>>
>> Thank you, Gaetan, for your suggestion. I had thought about doing that originally, but left it out since I thought a rank owned the entire row of the matrix (and not only the diagonal block). I will certainly give it a try.
>>
>> I still need MatSolveTranspose since I need the ability to reuse the residual Jacobian matrix from the flow solver (a first-order approximation of it), which is assembled in non-transposed form. This way the adjoint system is solved in a pseudo-time-stepping manner, where the product of the exact Jacobian matrix and the adjoint vector is used as a source term in the rhs.
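>>
>> One pseudo-time step then looks roughly like this (a sketch only; J_exact, g, psi and friends are placeholders for my actual variables, and ksp is built on the non-transposed first-order Jacobian):
>>
>>   MatMultTranspose(J_exact, psi, r);     /* exact J^T * psi (source term)    */
>>   VecWAXPY(rhs, -1.0, r, g);             /* rhs = g - J_exact^T * psi        */
>>   KSPSolveTranspose(ksp, rhs, dpsi);     /* transpose solve: needs           */
>>                                          /* MatSolveTranspose under LU/ILU   */
>>   VecAXPY(psi, 1.0, dpsi);               /* update the adjoint vector        */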
>>
>> Hong, do you have an estimate of the time required to implement it in superlu_dist?
>>
>> Best,
>> Antoine
>>
>> -----Original Message-----
>> From: Hong [mailto:hzhang at mcs.anl.gov]
>> Sent: Friday, August 29, 2014 9:14 PM
>> To: Gaetan Kenway
>> Cc: Antoine De Blois; petsc-users at mcs.anl.gov
>> Subject: Re: [petsc-users] superlu_dist and MatSolveTranspose
>>
>> We can add MatSolveTranspose() to the petsc interface with superlu_dist.
>>
>> Jed,
>> Are you working on it? If not, I can work on it.
>>
>> Hong
>>
>> On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway <gaetank at gmail.com> wrote:
>>> Hi Antoine
>>>
>>> We are also using PETSc for solving adjoint systems resulting from
>>> CFD. To get around the MatSolveTranspose issue we just assemble the
>>> transpose matrix directly and then call KSPSolve(). If this is
>>> possible in your application, I think it is probably the best
>>> approach; a rough sketch follows.
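>>>
>>> Roughly, the assembly looks like this (an untested sketch; At, ksp, b, x are placeholders, and since MatSetValue stashes off-process entries, a rank does not need to own the whole row of the transpose):
>>>
>>>   Mat At;
>>>   MatCreate(PETSC_COMM_WORLD, &At);
>>>   MatSetSizes(At, n, m, N, M);               /* swapped (transposed) sizes   */
>>>   MatSetFromOptions(At);
>>>   MatSetUp(At);
>>>   /* for each local Jacobian entry (i, j, v), insert at (j, i) instead */
>>>   MatSetValue(At, j, i, v, INSERT_VALUES);
>>>   MatAssemblyBegin(At, MAT_FINAL_ASSEMBLY);
>>>   MatAssemblyEnd(At, MAT_FINAL_ASSEMBLY);
>>>   KSPSetOperators(ksp, At, At);
>>>   KSPSolve(ksp, b, x);                       /* ordinary solve; no transpose */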
>>>
>>> Gaetan
>>>
>>>
>>> On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois 
>>> <antoine.deblois at aero.bombardier.com> wrote:
>>>>
>>>> Hello Jed,
>>>>
>>>> Thank you for your quick response. So I spent some time digging
>>>> deeper into my problem. I wrote a shell script that sweeps through a
>>>> range of ksp_type, pc_type and sub_pc_type combinations. So please
>>>> disregard the comment about "does not converge properly for
>>>> transpose". I had drawn that conclusion from my own code (and not
>>>> from ex10 and the extracted matrix), and a KSPSetFromOptions call was missing. Apologies for that.
>>>>
>>>> What remains is the performance issue. The MatSolveTranspose takes
>>>> a very long time to converge. For a matrix of 3 million rows,
>>>> MatSolveTranspose takes roughly 5 minutes on 64 CPUs, whereas the
>>>> MatSolve is almost instantaneous! When I run my code under gdb, PETSc
>>>> seems to be stalled in MatLUFactorNumeric_SeqAIJ_Inode() for a long time.
>>>> I also did a top on the compute node to check the RAM usage. It was
>>>> hovering around 2 GB, so memory usage does not seem to be an issue here.
>>>>
>>>> #0  0x00002afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #1  0x00002afe8e07f15c in MatLUFactorNumeric () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #2  0x00002afe8e2afa99 in PCSetUp_ILU () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #3  0x00002afe8e337c0d in PCSetUp () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #4  0x00002afe8e39d643 in KSPSetUp () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #5  0x00002afe8e39e3ee in KSPSolveTranspose () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #6  0x00002afe8e300f8c in PCApplyTranspose_ASM () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #7  0x00002afe8e338c13 in PCApplyTranspose () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #8  0x00002afe8e3a8a84 in KSPInitialResidual () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #9  0x00002afe8e376c32 in KSPSolve_GMRES () from /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/libpetsc.so.3.5
>>>> #10 0x00002afe8e39e425 in KSPSolveTranspose ()
>>>>
>>>> For that particular application, I was using:
>>>> ksp_type:                       gmres
>>>> pc_type:                        asm
>>>> sub_pc_type:                    ilu
>>>> adj_sub_pc_factor_levels:       1
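>>>>
>>>> (On the command line this corresponds to something like -adj_ksp_type
>>>> gmres -adj_pc_type asm -adj_sub_pc_type ilu -adj_sub_pc_factor_levels 1,
>>>> assuming our "adj_" prefix is set on the adjoint KSP via
>>>> KSPSetOptionsPrefix(ksp, "adj_").)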
>>>>
>>>> For small matrices, the MatSolveTranspose computing time is very 
>>>> similar to the simple MatSolve.
>>>>
>>>> And if I want to revert to a MatTranspose followed by the MatSolve,
>>>> then the MatTranspose takes forever to finish... For a matrix of 3
>>>> million rows, MatTranspose takes 30 minutes on 64 CPUs!
>>>>
>>>> So thank you for implementing the transpose solve in superlu_dist. 
>>>> It would also be nice to have it with hypre.
>>>> Let me know what you think, along with any ideas on how to improve my
>>>> computational time.
>>>> Regards,
>>>> Antoine
>>>>
>>>> -----Original Message-----
>>>> From: Jed Brown [mailto:jed at jedbrown.org]
>>>> Sent: Thursday, August 28, 2014 5:01 PM
>>>> To: Antoine De Blois; 'petsc-users at mcs.anl.gov'
>>>> Subject: Re: [petsc-users] superlu_dist and MatSolveTranspose
>>>>
>>>> Antoine De Blois <antoine.deblois at aero.bombardier.com> writes:
>>>>
>>>> > Hello everyone,
>>>> >
>>>> > I am trying to solve a A^T x = b system. For my applications, I 
>>>> > had realized that the MatSolveTranspose does not converge properly.
>>>>
>>>> What do you mean "does not converge properly"?  Can you send a test 
>>>> case where the transpose solve should be equivalent, but is not?  
>>>> We have only a few tests for transpose solve and not all 
>>>> preconditioners support it, but where it is supported, we want to ensure that it is correct.
>>>>
>>>> > Therefore, I had implemented a MatTranspose followed by a MatSolve.
>>>> > This proved to converge perfectly (which is strange since the 
>>>> > transposed matrix has the same eigenvalues as the untransposed...).
>>>> > The problem is that for bigger matrices, the MatTranspose is very 
>>>> > costly and thus cannot be used.
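>>>> >
>>>> > In code, that workaround is simply (a sketch; ksp, b, x are placeholders):
>>>> >
>>>> >   Mat At;
>>>> >   MatTranspose(A, MAT_INITIAL_MATRIX, &At);  /* the costly step        */
>>>> >   KSPSetOperators(ksp, At, At);
>>>> >   KSPSolve(ksp, b, x);                       /* converges as expected  */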
>>>>
>>>> Costly in terms of memory?  (I want you to be able to use 
>>>> KSPSolveTranspose, but I'm curious what you're experiencing.)
>>>>
>>>> > I tried using the superlu_dist package. Although the package works
>>>> > perfectly for the MatSolve, I get a "No support for this operation
>>>> > for this object type" error with MatSolveTranspose. I reproduced
>>>> > the error using MatView and the ex10 tutorial. I can provide the
>>>> > matrix and rhs upon request. My command line was:
>>>> >
>>>> > ex10 -f0 A_and_rhs.bin -pc_type lu -pc_factor_mat_solver_package 
>>>> > superlu_dist -trans
>>>> >
>>>> > So is there an additional parameter I need to use for the
>>>> > transposed solve?
>>>> > [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>>> > [0]PETSC ERROR: No support for this operation for this object type
>>>> > [0]PETSC ERROR: Matrix type mpiaij
>>>>
>>>> This is easy to add.  I'll do it now.
>>>>
>>>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>>> > [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown
>>>> > [0]PETSC ERROR: /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 on a ARGUS_impi_opt named hpc-user11 by ad007804 Thu Aug 28 16:41:15 2014
>>>> > [0]PETSC ERROR: Configure options --CFLAGS="-xHost -axAVX" --download-hypre --download-metis --download-ml --download-parmetis --download-scalapack --download-superlu_dist --download-mumps --with-c2html=0 --with-cc=mpiicc --with-fc=mpiifort --with-cxx=mpiicpc --with-debugging=yes --prefix=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/petsc-3.5.1 --with-cmake=/gpfs/fs1/aero/SOFTWARE/TOOLS/CMAKE/cmake-2.8.7/bin/cmake --with-valgrind=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/valgrind-3.9.0/bin/valgrind --with-shared-libraries=0
>>>> > [0]PETSC ERROR: #1 MatSolveTranspose() line 3473 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/mat/interface/matrix.c
>>>> > [0]PETSC ERROR: #2 PCApplyTranspose_LU() line 214 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/impls/factor/lu/lu.c
>>>> > [0]PETSC ERROR: #3 PCApplyTranspose() line 573 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/interface/precon.c
>>>> > [0]PETSC ERROR: #4 KSP_PCApply() line 233 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/include/petsc-private/kspimpl.h
>>>> > [0]PETSC ERROR: #5 KSPInitialResidual() line 63 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interface/itres.c
>>>> > [0]PETSC ERROR: #6 KSPSolve_GMRES() line 234 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/impls/gmres/gmres.c
>>>> > [0]PETSC ERROR: #7 KSPSolveTranspose() line 704 in /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interface/itfunc.c
>>>> > [0]PETSC ERROR: #8 main() line 324 in /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10.c
>>>> >
>>>> > FYI, the transpose solve is a typical operation in adjoint
>>>> > optimization. There should be a large community of adjoint
>>>> > developers who need to solve transposed systems.
>>>> >
>>>> > Any help is much appreciated,
>>>> > Best,
>>>> > Antoine
>>>> >
>>>> >
>>>> > Antoine DeBlois
>>>> > Spécialiste ingénierie, MDO lead / Engineering Specialist, MDO lead
>>>> > Aéronautique / Aerospace
>>>> > 514-855-5001, x 50862
>>>> > antoine.deblois at aero.bombardier.com
>>>> >
>>>> > 2351 Blvd Alfred-Nobel
>>>> > Montreal, Qc
>>>> > H4S 1A9
>>>> >
>>>
>>>

