[petsc-dev] Petsc "make test" have more failures for --with-openmp=1

Eric Chamberland Eric.Chamberland at giref.ulaval.ca
Tue Mar 2 14:14:40 CST 2021


Hi,

It all started when I wanted to test PETSc/CUDA compatibility for our code.

I had to activate --with-openmp to configure with --with-cuda=1 
successfully.

I then saw that PETSC_HAVE_OPENMP is used at least in MUMPS (and some 
other places).

So, I configured and tested petsc with openmp activated, without CUDA.

The first thing I saw was that our code's CI pipelines now fail for many 
tests.

After looking deeper, it seems that PETSc itself fails many tests when I 
activate openmp!

Here are all the configurations I have results for, after/before 
activating OpenMP for PETSc:

==============================================================================

==============================================================================

For petsc/master + OpenMPI 4.0.4 + MKL 2019.4.243:

With OpenMP=1

https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.03.02.02h00m02s_make_test.log

https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.03.02.02h00m02s_configure.log

# -------------
#   Summary
# -------------
# FAILED snes_tutorials-ex12_quad_hpddm_reuse_baij diff-ksp_ksp_tests-ex33_superlu_dist_2 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0 diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1 ksp_ksp_tutorials-ex50_tut_2 diff-ksp_ksp_tests-ex33_superlu_dist diff-snes_tutorials-ex56_hypre snes_tutorials-ex17_3d_q3_trig_elas snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij ksp_ksp_tutorials-ex5_superlu_dist_3 ksp_ksp_tutorials-ex5f_superlu_dist snes_tutorials-ex12_tri_parmetis_hpddm_baij diff-snes_tutorials-ex19_tut_3 mat_tests-ex242_3 snes_tutorials-ex17_3d_q3_trig_vlap ksp_ksp_tutorials-ex5f_superlu_dist_3 snes_tutorials-ex19_superlu_dist diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre diff-ksp_ksp_tutorials-ex49_hypre_nullspace ts_tutorials-ex18_p1p1_xper_ref ts_tutorials-ex18_p1p1_xyper_ref snes_tutorials-ex19_superlu_dist_2 ksp_ksp_tutorials-ex5_superlu_dist_2 diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre ksp_ksp_tutorials-ex64_1 ksp_ksp_tutorials-ex5_superlu_dist ksp_ksp_tutorials-ex5f_superlu_dist_2
# success 8275/10003 tests (82.7%)
# failed 33/10003 tests (0.3%)

With OpenMP=0

https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.02.26.02h00m16s_make_test.log

https://giref.ulaval.ca/~cmpgiref/petsc-master-debug/2021.02.26.02h00m16s_configure.log

# -------------
#   Summary
# -------------
# FAILED tao_constrained_tutorials-tomographyADMM_6 snes_tutorials-ex17_3d_q3_trig_elas mat_tests-ex242_3 snes_tutorials-ex17_3d_q3_trig_vlap tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_5
# success 8262/9983 tests (82.8%)
# failed 6/9983 tests (0.1%)

==============================================================================

==============================================================================

For OpenMPI 3.1.x/master:

With OpenMP=1:

https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.03.01.22h00m01s_make_test.log

https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.03.01.22h00m01s_configure.log

# -------------
#   Summary
# -------------
# FAILED mat_tests-ex242_3 mat_tests-ex242_2 diff-mat_tests-ex219f_1 diff-dm_tutorials-ex11f90_1 ksp_ksp_tutorials-ex5_superlu_dist_3 diff-ksp_ksp_tutorials-ex49_hypre_nullspace ksp_ksp_tutorials-ex5f_superlu_dist_3 snes_tutorials-ex17_3d_q3_trig_vlap diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre diff-snes_tutorials-ex19_tut_3 diff-snes_tutorials-ex56_hypre diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_4 tao_constrained_tutorials-tomographyADMM_6 diff-tao_constrained_tutorials-toyf_1
# success 8142/9765 tests (83.4%)
# failed 16/9765 tests (0.2%)

With OpenMP=0:

https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.02.28.22h00m02s_make_test.log

https://giref.ulaval.ca/~cmpgiref/ompi_3.x/2021.02.28.22h00m02s_configure.log

# -------------
#   Summary
# -------------
# FAILED mat_tests-ex242_3 mat_tests-ex242_2 diff-mat_tests-ex219f_1 diff-dm_tutorials-ex11f90_1 ksp_ksp_tutorials-ex56_2 snes_tutorials-ex17_3d_q3_trig_vlap tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_4 diff-tao_constrained_tutorials-toyf_1
# success 8151/9767 tests (83.5%)
# failed 9/9767 tests (0.1%)

==============================================================================

==============================================================================

For OpenMPI 4.0.x/master:

With OpenMP=1:

https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.03.01.20h00m01s_make_test.log

https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.03.01.20h00m01s_configure.log

# FAILED snes_tutorials-ex17_3d_q3_trig_elas snes_tutorials-ex19_hypre ksp_ksp_tutorials-ex56_2 tao_leastsquares_tutorials-tomography_1 tao_constrained_tutorials-tomographyADMM_5 mat_tests-ex242_3 ksp_ksp_tutorials-ex55_hypre ksp_ksp_tutorials-ex5_superlu_dist_2 tao_constrained_tutorials-tomographyADMM_6 snes_tutorials-ex56_hypre snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre ksp_ksp_tutorials-ex5f_superlu_dist_3 ksp_ksp_tutorials-ex34_hyprestruct diff-ksp_ksp_tutorials-ex49_hypre_nullspace snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre ksp_ksp_tutorials-ex5f_superlu_dist ksp_ksp_tutorials-ex5f_superlu_dist_2 ksp_ksp_tutorials-ex5_superlu_dist snes_tutorials-ex19_tut_3 snes_tutorials-ex19_superlu_dist ksp_ksp_tutorials-ex50_tut_2 snes_tutorials-ex17_3d_q3_trig_vlap ksp_ksp_tutorials-ex5_superlu_dist_3 snes_tutorials-ex19_superlu_dist_2 tao_constrained_tutorials-tomographyADMM_4 ts_tutorials-ex26_2
# success 8125/9753 tests (83.3%)
# failed 26/9753 tests (0.3%)

With OpenMP=0

https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.02.28.20h00m04s_make_test.log

https://giref.ulaval.ca/~cmpgiref/ompi_4.x/2021.02.28.20h00m04s_configure.log

# FAILED mat_tests-ex242_3
# success 8174/9777 tests (83.6%)
# failed 1/9777 tests (0.0%)

==============================================================================

==============================================================================
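
To make the before/after comparison easier to read, the failing sets can be diffed rather than compared by eye. The following is only a rough sketch, not part of the PETSc test harness: the helper name failed_tests is hypothetical, and it assumes the summary keeps the "# FAILED name name ..." layout shown above. The two strings are the OpenMPI 3.1.x summaries copied from this message.

```python
def failed_tests(summary: str) -> set:
    """Return the test names listed on the '# FAILED' line of a summary.

    Hypothetical helper: assumes the '# FAILED name name ...' layout
    seen in the make_test.log summaries quoted in this message.
    """
    for line in summary.splitlines():
        if line.startswith("# FAILED"):
            # Tokens are '#', 'FAILED', then one test name per token.
            return set(line.split()[2:])
    return set()


# OpenMPI 3.1.x summaries, copied from the logs quoted above.
with_openmp = (
    "# FAILED mat_tests-ex242_3 mat_tests-ex242_2 diff-mat_tests-ex219f_1 "
    "diff-dm_tutorials-ex11f90_1 ksp_ksp_tutorials-ex5_superlu_dist_3 "
    "diff-ksp_ksp_tutorials-ex49_hypre_nullspace "
    "ksp_ksp_tutorials-ex5f_superlu_dist_3 "
    "snes_tutorials-ex17_3d_q3_trig_vlap "
    "diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre "
    "diff-snes_tutorials-ex19_tut_3 diff-snes_tutorials-ex56_hypre "
    "diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre "
    "tao_leastsquares_tutorials-tomography_1 "
    "tao_constrained_tutorials-tomographyADMM_4 "
    "tao_constrained_tutorials-tomographyADMM_6 "
    "diff-tao_constrained_tutorials-toyf_1"
)
without_openmp = (
    "# FAILED mat_tests-ex242_3 mat_tests-ex242_2 diff-mat_tests-ex219f_1 "
    "diff-dm_tutorials-ex11f90_1 ksp_ksp_tutorials-ex56_2 "
    "snes_tutorials-ex17_3d_q3_trig_vlap "
    "tao_leastsquares_tutorials-tomography_1 "
    "tao_constrained_tutorials-tomographyADMM_4 "
    "diff-tao_constrained_tutorials-toyf_1"
)

on = failed_tests(with_openmp)
off = failed_tests(without_openmp)

# Failures that appear only once OpenMP is enabled.
new_with_openmp = sorted(on - off)
# Failures that appear only in the run without OpenMP.
only_without_openmp = sorted(off - on)

print("New failures with OpenMP=1:")
for name in new_with_openmp:
    print("  " + name)
print("Failures seen only with OpenMP=0:")
for name in only_without_openmp:
    print("  " + name)
```

For this pair of runs, the failures that appear only with OpenMP=1 are the superlu_dist and hypre tests plus tomographyADMM_6. The same helper could be pointed at the other two pairs of summaries.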

Is that known and normal?

In all cases, I am using MKL, and I suspect it may come from there... :/

I also saw a second problem: "make test" fails to compile the PETSc 
examples with older versions of MKL. That is less important for me, since 
I just upgraded to oneAPI to avoid it, but you may want to know:

https://giref.ulaval.ca/~cmpgiref/dernier_ompi/2021.03.02.02h16m01s_make_test.log

https://giref.ulaval.ca/~cmpgiref/dernier_ompi/2021.03.02.02h16m01s_configure.log

Thanks,

Eric

-- 
Eric Chamberland, ing., M. Ing
Professionnel de recherche
GIREF/Université Laval
(418) 656-2131 poste 41 22 42
