[petsc-users] Troubles updating my code from PETSc-3.4 to 3.5 Using MUMPS for KSPSolve()
Marc MEDALE
marc.medale at univ-amu.fr
Tue Dec 16 02:55:14 CST 2014
Dear Barry and Matt,
Since the outputs from -ksp_view and -ksp_monitor were not very helpful, I come back to you with results from more detailed tests on the solution of a very ill-conditioned algebraic system solved in parallel with KSPSolve and the MUMPS direct solver.
1) I have dumped into binary files both the assembled matrix and the RHS computed with the two versions of my research code (PETSc-3.4p4 and 3.5p1). The respective files are: Mat_bin_3.4p4, RHS_bin_3.4p4; Mat_bin_3.5p1, RHS_bin_3.5p1. A minimal sketch of the dump calls is given just below.
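For completeness, here is a minimal sketch (in C, not my actual Fortran code, and with an illustrative routine name) of how such binary dumps can be produced with the standard PETSc binary viewer; the file names match the 3.5p1 pair listed above:

#include <petscksp.h>

/* Write the assembled matrix A and right-hand side b to PETSc binary files. */
PetscErrorCode DumpSystem(Mat A, Vec b)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Mat_bin_3.5p1", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "RHS_bin_3.5p1", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = VecView(b, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  return 0;
}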
2) To rule out any possible bug in my own code upgrade, I have run /src/ksp/ksp/examples/tutorials/ex10 (slightly modified to compute the L2 norm of the solution vector; the modified ex10.c is attached to this e-mail and a sketch of the addition follows the command lines below) on 40 cores with the two PETSc versions and both combinations of Mat and RHS, with the following command line options:
-f0 Mat_bin_3.5p1 -f1 Mat_bin_3.4p4 -rhs RHS_bin_3.4p4 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_8 0 -mat_type mpiaij -vec_type mpi -options_left
and:
-f0 Mat_bin_3.5p1 -f1 Mat_bin_3.4p4 -rhs RHS_bin_3.5p1 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_8 0 -mat_type mpiaij -vec_type mpi -options_left
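For reference, the modification to ex10 amounts to the following few lines right after KSPSolve() (this is only a sketch; the attached ex10.c is the authoritative version):

/* Compute and print the L2 norm of the solution vector x, in addition to
   the residual norm that ex10 already reports. */
PetscReal solnorm;
ierr = VecNorm(x, NORM_2, &solnorm);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "Solution norm %g\n", (double)solnorm);CHKERRQ(ierr);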
3) The results below compare the outputs obtained by a diff on the respective output files (all four output files are attached to this e-mail):
a) ex10-PETSc-3.5p1, run with the various binary matrix and RHS files:
diff Test_ex10-3.5p1_rhs-3.5p1.out Test_ex10-3.5p1_rhs-3.4p4.out
2c2
< Residual norm 1.66855e-08
---
> Residual norm 1.66813e-08
5c5
< Residual norm 1.6675e-08
---
> Residual norm 1.66699e-08
16c16
< -rhs RHS_bin_3.5p1
---
> -rhs RHS_bin_3.4p4
b) ex10-PETSc-3.5p1 versus ex10-PETSc-3.4p4, with the various binary matrix and RHS files:
diff Test_ex10-3.5p1_rhs-3.5p1.out Test_ex10-3.4p4_rhs-3.5p1.out
2,3c2,3
< Residual norm 1.66855e-08
< Solution norm 0.0161289
---
> Residual norm 2.89642e-08
> Solution norm 0.0731946
5,6c5,6
< Residual norm 1.6675e-08
< Solution norm 0.0161289
---
> Residual norm 2.89849e-08
> Solution norm 0.0732001
4) Analysis:
- Test a) and its symmetric counterpart (undertaken with ex10-PETSc-3.4p4) demonstrate that the two matrices and the two RHS computed with the two PETSc versions are identical: they produce the same solution vector and comparable residuals, up to the numerical accuracy achievable when solving such ill-conditioned algebraic systems (condition number of order 1e9, which is the reason I use the MUMPS direct solver);
- Test b) and its symmetric counterpart (undertaken with the RHS computed with PETSc-3.4p4) show that a very different solution vector (more than a factor of 4 in the L2 norm) is obtained when solving the algebraic system with ex10-3.5p1 versus ex10-3.4p4, both with MUMPS-4.10.0 and the same command line options, whereas the residual norms differ only by about a factor of 2. The first two lines below refer to the former run and the last two to the latter:
< Residual norm 1.66855e-08
< Solution norm 0.0161289
---
> Residual norm 2.89642e-08
> Solution norm 0.0731946
5) Questions:
- Have any default values in the PETSc-MUMPS interface been changed from PETSc-3.4 to 3.5? (One way I could check this on my side is sketched below the questions.)
- What is going wrong?
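In case it helps to narrow this down, one way to rule out a changed MUMPS default would be to pin the relevant controls explicitly in both builds, either on the command line with -mat_mumps_icntl_<k> <value> or in code. A rough C sketch (the ICNTL indices and values below are illustrative examples, not a diagnosis) would be:

#include <petscksp.h>

/* Force LU with MUMPS and set selected ICNTL values explicitly, so both
   PETSc builds run MUMPS with identical settings.
   Call after KSPSetOperators() and before KSPSolve(). */
PetscErrorCode PinMumpsControls(KSP ksp)
{
  PC             pc;
  Mat            F;
  PetscErrorCode ierr;

  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
  ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr);  /* creates the MUMPS factor matrix */
  ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);
  ierr = MatMumpsSetIcntl(F, 7, 5);CHKERRQ(ierr);  /* ICNTL(7): pin the ordering (5 = METIS), example only */
  ierr = MatMumpsSetIcntl(F, 8, 0);CHKERRQ(ierr);  /* ICNTL(8): scaling, matching -mat_mumps_icntl_8 0 above */
  return 0;
}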
If you have some time to experiment on your side with the binary files (matrices and RHS), I would be pleased to provide them to you; just let me know where to drop them. Their size is approximately 775 MB for each matrix and 16 MB for each RHS.
Thank you for your help in overcoming this crazy problem.
Best regards.
Marc MEDALE
On Dec 11, 2014, at 18:01, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> Please run both with -ksp_monitor -ksp_type gmres and send the output
>
> Barry
>
>> On Dec 11, 2014, at 10:07 AM, Marc MEDALE <marc.medale at univ-amu.fr> wrote:
>>
>> Dear Matt,
>>
>> The output files obtained with the PETSc-3.4p4 and 3.5p1 versions using the following command line:
>> -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_8 0 -ksp_monitor -ksp_view
>>
>> are attached below. Leaving aside flops and memory usage per core, a diff between the two output files reduces to:
>> diff Output_3.4p4.txt Output_3.5p1.txt
>> 14c14
>> < Matrix Object: 64 MPI processes
>> ---
>>> Mat Object: 64 MPI processes
>> 18c18
>> < total: nonzeros=481059588, allocated nonzeros=481059588
>> ---
>>> total: nonzeros=4.8106e+08, allocated nonzeros=4.8106e+08
>> 457c457
>> < INFOG(10) (total integer space store the matrix factors after factorization): 26149876
>> ---
>>> INFOG(10) (total integer space store the matrix factors after factorization): 26136333
>> 461c461
>> < INFOG(14) (number of memory compress after factorization): 54
>> ---
>>> INFOG(14) (number of memory compress after factorization): 48
>> 468,469c468,469
>> < INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 338
>> < INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 19782
>> ---
>>> INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 334
>>> INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 19779
>> 472a473,478
>>> INFOG(28) (after factorization: number of null pivots encountered): 0
>>> INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 470143172
>>> INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 202, 10547
>>> INFOG(32) (after analysis: type of analysis done): 1
>>> INFOG(33) (value used for ICNTL(8)): 0
>>> INFOG(34) (exponent of the determinant if determinant is requested): 0
>> 474c480
>> < Matrix Object: 64 MPI processes
>> ---
>>> Mat Object: 64 MPI processes
>> 477c483
>> < total: nonzeros=63720324, allocated nonzeros=63720324
>> ---
>>> total: nonzeros=6.37203e+07, allocated nonzeros=6.37203e+07
>> 481c487
>> < Norme de U 1 7.37266E-02, L 1 1.00000E+00
>> ---
>>> Norme de U 1 1.61172E-02, L 1 1.00000E+00
>> 483c489
>> < Temps total d execution : 198.373291969299
>> ---
>>> Temps total d execution : 216.934082031250
>>
>>
>> This does not reveal any striking differences, except in the L2 norm of the solution vectors.
>>
>> I need assistance to overcome this quite bizarre behavior.
>>
>> Thank you.
>>
>> Marc MEDALE
>>
>> =========================================================
>> Université Aix-Marseille, Polytech'Marseille, Dépt Mécanique Energétique
>> Laboratoire IUSTI, UMR 7343 CNRS-Université Aix-Marseille
>> Technopole de Chateau-Gombert, 5 rue Enrico Fermi
>> 13453 MARSEILLE, Cedex 13, FRANCE
>> ---------------------------------------------------------------------------------------------------
>> Tel : +33 (0)4.91.10.69.14 ou 38
>> Fax : +33 (0)4.91.10.69.69
>> e-mail : marc.medale at univ-amu.fr
>> =========================================================
>>
>>
>>
>>
>>
>>
>>
>> On Dec 11, 2014, at 11:43, Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Thu, Dec 11, 2014 at 4:38 AM, Marc MEDALE <marc.medale at univ-amu.fr> wrote:
>>> Dear PETSC Users,
>>>
>>> I have just updated to PETSc-3.5 my research code, which has used PETSc for a while, but I'm facing an astonishing difference between the PETSc-3.4 and 3.5 versions when solving a very ill-conditioned algebraic system with MUMPS (4.10.0 in both cases).
>>>
>>> The only differences that arise in my Fortran source code are the following:
>>> Loma1-medale% diff ../version_3.5/solvEFL_MAN_SBIF.F ../version_3.4/solvEFL_MAN_SBIF.F
>>> 336,337d335
>>> < CALL MatSetOption(MATGLOB,MAT_KEEP_NONZERO_PATTERN,
>>> < & PETSC_TRUE,IER)
>>> 749,750c747,748
>>> < CALL KSPSetTolerances(KSP1,TOL,PETSC_DEFAULT_REAL,
>>> < & PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,IER)
>>> ---
>>>> CALL KSPSetTolerances(KSP1,TOL,PETSC_DEFAULT_DOUBLE_PRECISION,
>>>> & PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,IER)
>>> 909c907,908
>>> < CALL KSPSetOperators(KSP1,MATGLOB,MATGLOB,IER)
>>> ---
>>>> CALL KSPSetOperators(KSP1,MATGLOB,MATGLOB,
>>>> & SAME_NONZERO_PATTERN,IER)
>>>
>>> When I run the corresponding program versions on 128 cores of our cluster with the same input data and the following command line arguments:
>>> -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_8 0
>>>
>>> I get the following outputs:
>>> a) with PETSc-3.4p4:
>>> L2 norm of solution vector: 7.39640E-02,
>>>
>>> b) with PETSc-3.5p1:
>>> L2 norm of solution vector: 1.61325E-02
>>>
>>> Do I have to change something else when updating my KSP-based code from PETSc-3.4 to 3.5?
>>> Have any default values in the PETSc-MUMPS interface been changed from PETSc-3.4 to 3.5?
>>> Any hints or suggestions are welcome to help me recover the right results (obtained with PETSc-3.4).
>>>
>>> Send the output from -ksp_monitor -ksp_view for both runs. I am guessing that a MUMPS default changed between versions.
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>> Thank you very much.
>>>
>>> Marc MEDALE.
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>> -- Norbert Wiener
>>
>> <Output_3.4p4.txt><Output_3.5p1.txt>
>
-------------- next part --------------
[Attachments scrubbed by the mailing-list archive: ex10.c, Test_ex10-3.4p4_rhs-3.4p4.out, Test_ex10-3.4p4_rhs-3.5p1.out, Test_ex10-3.5p1_rhs-3.4p4.out, Test_ex10-3.5p1_rhs-3.5p1.out]