[petsc-dev] Petsc "make test" have more failures for --with-openmp=1

Pierre Jolivet pierre at joliv.et
Thu Mar 11 01:46:26 CST 2021



> On 11 Mar 2021, at 6:16 AM, Barry Smith <bsmith at petsc.dev> wrote:
> 
> 
>   Eric,
> 
>    Sorry about not being more immediate. We still have this in our active email so you don't need to submit individual issues. We'll try to get to them as soon as we can.

Indeed, I’m still trying to figure this out.
I realized that some of my configure flags were different from yours, e.g., no --with-memalign.
I’ve also added SuperLU_DIST to my installation.
Still, I can’t reproduce any issue.
I will continue looking into this. I am seeing some valgrind errors, but I don’t know whether they are a side effect of OpenMPI not being valgrind-clean (last time I checked, there were no errors with MPICH).

Thank you for your patience,
Pierre

/usr/bin/gmake -f gmakefile test test-fail=1
Using MAKEFLAGS: test-fail=1
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_baij.counts
 ok snes_tutorials-ex12_quad_hpddm_reuse_baij
 ok diff-snes_tutorials-ex12_quad_hpddm_reuse_baij
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex33_superlu_dist_2.counts
 ok ksp_ksp_tests-ex33_superlu_dist_2
 ok diff-ksp_ksp_tests-ex33_superlu_dist_2
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex49_superlu_dist.counts
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0
 ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1
 ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex50_tut_2.counts
 ok ksp_ksp_tutorials-ex50_tut_2
 ok diff-ksp_ksp_tutorials-ex50_tut_2
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex33_superlu_dist.counts
 ok ksp_ksp_tests-ex33_superlu_dist
 ok diff-ksp_ksp_tests-ex33_superlu_dist
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_hypre.counts
 ok snes_tutorials-ex56_hypre
 ok diff-snes_tutorials-ex56_hypre
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex56_2.counts
 ok ksp_ksp_tutorials-ex56_2
 ok diff-ksp_ksp_tutorials-ex56_2
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_elas.counts
 ok snes_tutorials-ex17_3d_q3_trig_elas
 ok diff-snes_tutorials-ex17_3d_q3_trig_elas
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij.counts
 ok snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij
 ok diff-snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_3.counts
not ok ksp_ksp_tutorials-ex5_superlu_dist_3 # Error code: 1
#	srun: error: Unable to create step for job 1426755: More processors requested than permitted
 ok ksp_ksp_tutorials-ex5_superlu_dist_3 # SKIP Command failed so no diff
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist.counts
 ok ksp_ksp_tutorials-ex5f_superlu_dist # SKIP Fortran required for this test
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_tri_parmetis_hpddm_baij.counts
 ok snes_tutorials-ex12_tri_parmetis_hpddm_baij
 ok diff-snes_tutorials-ex12_tri_parmetis_hpddm_baij
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_tut_3.counts
 ok snes_tutorials-ex19_tut_3
 ok diff-snes_tutorials-ex19_tut_3
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_vlap.counts
 ok snes_tutorials-ex17_3d_q3_trig_vlap
 ok diff-snes_tutorials-ex17_3d_q3_trig_vlap
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist_3.counts
 ok ksp_ksp_tutorials-ex5f_superlu_dist_3 # SKIP Fortran required for this test
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_superlu_dist.counts
 ok snes_tutorials-ex19_superlu_dist
 ok diff-snes_tutorials-ex19_superlu_dist
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre.counts
 ok snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre
 ok diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex49_hypre_nullspace.counts
 ok ksp_ksp_tutorials-ex49_hypre_nullspace
 ok diff-ksp_ksp_tutorials-ex49_hypre_nullspace
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_superlu_dist_2.counts
 ok snes_tutorials-ex19_superlu_dist_2
 ok diff-snes_tutorials-ex19_superlu_dist_2
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_2.counts
not ok ksp_ksp_tutorials-ex5_superlu_dist_2 # Error code: 1
#	srun: error: Unable to create step for job 1426755: More processors requested than permitted
 ok ksp_ksp_tutorials-ex5_superlu_dist_2 # SKIP Command failed so no diff
        TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre.counts
 ok snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre
 ok diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex64_1.counts
 ok ksp_ksp_tutorials-ex64_1
 ok diff-ksp_ksp_tutorials-ex64_1
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist.counts
not ok ksp_ksp_tutorials-ex5_superlu_dist # Error code: 1
#	srun: error: Unable to create step for job 1426755: More processors requested than permitted
 ok ksp_ksp_tutorials-ex5_superlu_dist # SKIP Command failed so no diff
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist_2.counts
 ok ksp_ksp_tutorials-ex5f_superlu_dist_2 # SKIP Fortran required for this test
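
In the log above, the only "not ok" entries are the three ksp_ksp_tutorials-ex5_superlu_dist variants, and each one carries the same srun message ("More processors requested than permitted") rather than a solver error. For longer runs, a tally like this can be produced with a small script. The following is only a minimal sketch in Python: the function name `summarize` and the regular expression are assumptions made for illustration, based solely on the line shapes visible in this log, and are not part of the PETSc test harness.

```python
import re
from collections import Counter

# Result lines as they appear in the log above, e.g.
#   " ok diff-foo"
#   "not ok foo # Error code: 1"
#   " ok foo # SKIP Fortran required for this test"
# (pattern is an assumption derived from this log only)
RESULT = re.compile(r"^\s*(not ok|ok)\s+(\S+)(?:\s+#\s*(.*))?$")


def summarize(lines):
    """Count passed / failed / skipped tests and collect the failures."""
    counts = Counter()
    failures = []
    for line in lines:
        m = RESULT.match(line)
        if not m:
            continue  # "TEST ..." headers, "#\tsrun: ..." diagnostics, etc.
        status, name, note = m.groups()
        if status == "not ok":
            counts["failed"] += 1
            failures.append((name, note or ""))
        elif note and note.startswith("SKIP"):
            counts["skipped"] += 1
        else:
            counts["passed"] += 1
    return counts, failures


# A few lines copied from the log above.
sample = """\
        TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_3.counts
not ok ksp_ksp_tutorials-ex5_superlu_dist_3 # Error code: 1
#\tsrun: error: Unable to create step for job 1426755: More processors requested than permitted
 ok ksp_ksp_tutorials-ex5_superlu_dist_3 # SKIP Command failed so no diff
 ok snes_tutorials-ex19_tut_3
 ok diff-snes_tutorials-ex19_tut_3
"""

counts, failures = summarize(sample.splitlines())
print(dict(counts))
for name, note in failures:
    print("FAILED:", name, "-", note)
```

Note that a failed command is followed by a skipped diff line, so the "skipped" count includes those entries as well as tests skipped for missing features such as Fortran.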

>    Barry
> 
> 
>> On Mar 10, 2021, at 11:03 PM, Eric Chamberland <Eric.Chamberland at giref.ulaval.ca> wrote:
>> 
>> Barry,
>> 
>> to get some follow-up on the --with-openmp=1 failures, shall I open GitLab issues for:
>> 
>> a) all hypre failures giving DIVERGED_INDEFINITE_PC
>> 
>> b) all superlu_dist failures giving different results with initia and "Exceeded timeout limit of 60 s"
>> 
>> c) hpddm failures "free(): invalid next size (fast)" and "Segmentation Violation"
>> 
>> d) all tao's "Exceeded timeout limit of 60 s"
>> 
>> I don't see how I could do all this debugging by myself...
>> 
>> Thanks,
>> 
>> Eric
>> 
>> 
> 
