[petsc-dev] Petsc "make test" have more failures for --with-openmp=1
Eric Chamberland
Eric.Chamberland at giref.ulaval.ca
Wed Mar 17 10:23:04 CDT 2021
Thanks Barry,
Just to report:
I tried to switch to the proposed smoother by default in our code:
-pc_hypre_boomeramg_relax_type_all l1scaled-SOR/Jacobi
However, I get some failures, even though I compiled without --with-openmp=1.
[0]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[0]PETSC ERROR: Error in external library
[0]PETSC ERROR: Error in jac->setup(): error code 12
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021
[0]PETSC ERROR:
/home/mefpp_ericc/GIREF/bin/Test.EstimationGradientHessien.dev on a
named rohan by ericc Wed Mar 17 11:05:23 2021
[0]PETSC ERROR: Configure options
--prefix=/opt/petsc-3.14.5_debug_openmpi-4.1.0 --with-mpi-compilers=1
--with-mpi-dir=/opt/openmpi-4.1.0 --with-cxx-dialect=C++14
--with-make-np=12 --with-shared-libraries=1 --with-debugging=yes
--with-memalign=64 --with-visibility=0 --with-64-bit-indices=0
--download-ml=yes --download-mumps=yes --download-superlu=yes
--download-hpddm=yes --download-slepc=yes --download-superlu_dist=yes
--download-parmetis=yes --download-ptscotch=yes --download-metis=yes
--download-strumpack=yes --download-suitesparse=yes --download-hypre=yes
--with-blaslapack-dir=/opt/intel/oneapi/mkl/2021.1.1/env/../lib/intel64
--with-mkl_pardiso-dir=/opt/intel/oneapi/mkl/2021.1.1/env/..
--with-mkl_cpardiso-dir=/opt/intel/oneapi/mkl/2021.1.1/env/..
--with-scalapack=1
--with-scalapack-include=/opt/intel/oneapi/mkl/2021.1.1/env/../include
--with-scalapack-lib="-L/opt/intel/oneapi/mkl/2021.1.1/env/../lib/intel64
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64"
[0]PETSC ERROR: #1 PCSetUp_HYPRE() line 408 in
/tmp/petsc-3.14.5-debug/src/ksp/pc/impls/hypre/hypre.c
[0]PETSC ERROR: #2 PCSetUp() line 1009 in
/tmp/petsc-3.14.5-debug/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #3 KSPSetUp() line 406 in
/tmp/petsc-3.14.5-debug/src/ksp/ksp/interface/itfunc.c
But it seems to happen only in some cases, specifically with hermitian
elements, which have a lot of DOFs per vertex... It seems to work well
otherwise, with some differences in the results that I still have to analyse...
Do you think this might be a PETSc bug?
Does the error code come from PETSc or Hypre?
(if from hypre, I suggest saying "Hypre error code: 12" instead...)
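
On what the value 12 might mean: if it is hypre's error flag passed through unchanged (an assumption, the trace above does not say so), then as far as I understand hypre's HYPRE_utilities.h the flag is a bit field: 1 for a generic error, 2 for a memory error, 4 for an argument error with the argument index stored in the next five bits, and 256 for a convergence error. A minimal sketch of decoding it under that assumption (the name decode_hypre_error is made up for this illustration and is not a hypre or PETSc function):

```python
# Bit layout as I understand it from hypre's HYPRE_utilities.h (assumption:
# check the header of the hypre version actually in use).
HYPRE_ERROR_GENERIC = 1    # generic error
HYPRE_ERROR_MEMORY = 2     # unable to allocate memory
HYPRE_ERROR_ARG = 4        # argument error
HYPRE_ERROR_CONV = 256     # method did not converge as expected


def decode_hypre_error(code):
    """Return a list of human-readable reasons encoded in a hypre error flag."""
    reasons = []
    if code & HYPRE_ERROR_GENERIC:
        reasons.append("generic error")
    if code & HYPRE_ERROR_MEMORY:
        reasons.append("memory allocation error")
    if code & HYPRE_ERROR_ARG:
        # Bits 3..7 hold the index of the offending argument.
        arg_index = (code >> 3) & 31
        reasons.append("error in argument %d" % arg_index)
    if code & HYPRE_ERROR_CONV:
        reasons.append("convergence error")
    return reasons


print(decode_hypre_error(12))
```

Under that reading, 12 would be an argument error (4) with argument index 1 (8), which would point at hypre rejecting an input to its setup routine rather than at a code generated by PETSc itself.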
Thanks,
Eric
On 2021-03-15 2:50 p.m., Barry Smith wrote:
>
> I posted some information at the issue.
>
> IMHO it is likely a bug in one or more of hypre's smoothers that
> use OpenMP. We have never tested them before (and likely hypre has not
> tested all the combinations) and so would not have seen the bug.
> Hopefully they can just fix it.
>
> Barry
>
> I got the problem to occur with ex56 with 2 MPI ranks and 4 OpenMP
> threads; if I used fewer than 4 threads it did not generate an
> indefinite preconditioner.
>
>
>> On Mar 14, 2021, at 1:18 PM, Eric Chamberland
>> <Eric.Chamberland at giref.ulaval.ca> wrote:
>>
>> Done:
>>
>> https://github.com/hypre-space/hypre/issues/303
>>
>> Maybe I will need some help about PETSc to answer their questions...
>>
>> Eric
>>
>> On 2021-03-14 3:44 a.m., Stefano Zampini wrote:
>>> Eric
>>>
>>> You should report these HYPRE issues upstream
>>> https://github.com/hypre-space/hypre/issues
>>>
>>>
>>>> On Mar 14, 2021, at 3:44 AM, Eric Chamberland
>>>> <Eric.Chamberland at giref.ulaval.ca> wrote:
>>>>
>>>> For us it clearly creates problems in real computations...
>>>>
>>>> I understand the need to have clean tests for PETSc, but for me, it
>>>> reveals that hypre isn't usable with more than one thread for now...
>>>>
>>>> Another solution: force single-threaded configuration for hypre
>>>> until this is fixed?
>>>>
>>>> Eric
>>>>
>>>> On 2021-03-13 8:50 a.m., Pierre Jolivet wrote:
>>>>> -pc_hypre_boomeramg_relax_type_all Jacobi =>
>>>>> Linear solve did not converge due to DIVERGED_INDEFINITE_PC
>>>>> iterations 3
>>>>> -pc_hypre_boomeramg_relax_type_all l1scaled-Jacobi =>
>>>>> OK, independently of the architecture it seems (Eric's Docker image
>>>>> with 1 or 2 threads or my macOS), but the contraction factor is higher
>>>>> Linear solve converged due to CONVERGED_RTOL iterations 8
>>>>> Linear solve converged due to CONVERGED_RTOL iterations 24
>>>>> Linear solve converged due to CONVERGED_RTOL iterations 26
>>>>> vs. currently
>>>>> Linear solve converged due to CONVERGED_RTOL iterations 7
>>>>> Linear solve converged due to CONVERGED_RTOL iterations 9
>>>>> Linear solve converged due to CONVERGED_RTOL iterations 10
>>>>>
>>>>> Do we change this? Or should we force OMP_NUM_THREADS=1 for make test?
>>>>>
>>>>> Thanks,
>>>>> Pierre
>>>>>
>>>>>> On 13 Mar 2021, at 2:26 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>>>>
>>>>>> Hypre uses a multiplicative smoother by default. It has a
>>>>>> chebyshev smoother. That with a Jacobi PC should be thread
>>>>>> invariant.
>>>>>> Mark
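
To illustrate the point about multiplicative versus additive smoothers, here is a toy model in plain Python. It is not hypre's implementation: it assumes a threaded Gauss-Seidel-type sweep uses updated values only inside each thread's chunk of rows and old values across chunks, and compares that with a Jacobi sweep. The function names and the 1D Laplacian test matrix are chosen for this sketch only.

```python
def chunks(n, nblocks):
    """Split row indices 0..n-1 into contiguous chunks, one per 'thread'."""
    size = -(-n // nblocks)  # ceiling division
    return [(lo, min(lo + size, n)) for lo in range(0, n, size)]


def jacobi_sweep(A, b, x, nblocks):
    """Additive sweep: every row reads only the old iterate x."""
    n = len(b)
    new = list(x)
    for lo, hi in chunks(n, nblocks):
        for i in range(lo, hi):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new[i] = (b[i] - s) / A[i][i]
    return new


def hybrid_gs_sweep(A, b, x, nblocks):
    """Multiplicative sweep inside a chunk, old values across chunks."""
    n = len(b)
    new = list(x)
    for lo, hi in chunks(n, nblocks):
        for i in range(lo, hi):
            s = 0.0
            for j in range(n):
                if j == i:
                    continue
                # Rows already updated in this chunk contribute new values;
                # everything else is read from the old iterate.
                xj = new[j] if lo <= j < i else x[j]
                s += A[i][j] * xj
            new[i] = (b[i] - s) / A[i][i]
    return new


def laplacian_1d(n):
    """Tridiagonal (-1, 2, -1) test matrix as a dense list of lists."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


A, b, x0 = laplacian_1d(8), [1.0] * 8, [0.0] * 8
print(jacobi_sweep(A, b, x0, 1) == jacobi_sweep(A, b, x0, 4))
print(hybrid_gs_sweep(A, b, x0, 1) == hybrid_gs_sweep(A, b, x0, 4))
```

The first comparison prints True and the second prints False: in this model the Jacobi result does not depend on how rows are split across threads, while the Gauss-Seidel-type result does, so a different thread count gives a different preconditioner.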
>>>>>>
>>>>>> On Sat, Mar 13, 2021 at 8:18 AM Pierre Jolivet <pierre at joliv.et> wrote:
>>>>>>
>>>>>>
>>>>>>> On 13 Mar 2021, at 9:17 AM, Pierre Jolivet <pierre at joliv.et> wrote:
>>>>>>>
>>>>>>> Hello Eric,
>>>>>>> I’ve made an “interesting” discovery, so I’ll put back the
>>>>>>> list in c/c.
>>>>>>> It appears the following snippet of code which uses
>>>>>>> Allreduce() + lambda function + MPI_IN_PLACE is:
>>>>>>> - Valgrind-clean with MPICH;
>>>>>>> - Valgrind-clean with OpenMPI 4.0.5;
>>>>>>> - not Valgrind-clean with OpenMPI 4.1.0.
>>>>>>> I’m not sure who is to blame here, I’ll need to look at the
>>>>>>> MPI specification for what is required by the implementors
>>>>>>> and users in that case.
>>>>>>>
>>>>>>> In the meantime, I’ll do the following:
>>>>>>> - update config/BuildSystem/config/packages/OpenMPI.py to
>>>>>>> use OpenMPI 4.1.0, see if any other error appears;
>>>>>>> - provide a hotfix to bypass the segfaults;
>>>>>>
>>>>>> I can confirm that splitting the single Allreduce with my own
>>>>>> MPI_Op into two Allreduce with MAX and BAND fixes the
>>>>>> segfaults with OpenMPI (*).
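
Why such a split can be equivalent is sketched below without MPI, using plain Python reductions over per-rank contributions. The shape of the user-defined operation is an assumption (a maximum on one field combined with a bitwise AND on another, which the message suggests but does not show), and all names are hypothetical.

```python
from functools import reduce


def combined_op(left, right):
    """Hypothetical user-defined reduction on (value, flags) pairs."""
    return (max(left[0], right[0]), left[1] & right[1])


def allreduce_combined(contributions):
    """Model of one Allreduce with the user-defined operation."""
    return reduce(combined_op, contributions)


def allreduce_split(contributions):
    """Model of two Allreduce calls, one with MAX and one with BAND."""
    values = [value for value, _ in contributions]
    flags = [flag for _, flag in contributions]
    return (reduce(max, values), reduce(lambda a, b: a & b, flags))


# One (value, flags) pair per simulated rank.
per_rank = [(3, 0b1110), (7, 0b0111), (5, 0b1111), (1, 0b0110)]
print(allreduce_combined(per_rank) == allreduce_split(per_rank))
```

Because the two fields never interact inside the combined operation, reducing them separately gives the same pair, at the cost of a second collective call.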
>>>>>>
>>>>>>> - look at the hypre issue and whether they should be
>>>>>>> deferred to the hypre team.
>>>>>>
>>>>>> I don’t know if there is something wrong in hypre threading
>>>>>> or if it’s just a side effect of threading, but it seems that
>>>>>> the number of threads has a drastic effect on the quality of
>>>>>> the PC.
>>>>>> By default, it looks like there are two threads per process
>>>>>> with your Docker image.
>>>>>> If I force OMP_NUM_THREADS=1, then I get the same convergence
>>>>>> as in the output file.
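
One way to pin the thread count for a single run without changing the surrounding shell is to pass a modified environment to the child process. This is a minimal sketch using only the Python standard library; the helper name run_single_threaded is made up here, and the child command is a stand-in that reports the variable it sees.

```python
import os
import subprocess
import sys


def run_single_threaded(command):
    """Run command with OMP_NUM_THREADS=1, leaving the caller's environment untouched."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = "1"
    return subprocess.run(command, env=env, capture_output=True,
                          text=True, check=True)


# Stand-in child process that just reports the variable it sees; a real run
# would pass the test executable and its arguments instead.
probe = [sys.executable, "-c",
         "import os; print(os.environ['OMP_NUM_THREADS'])"]
result = run_single_threaded(probe)
print(result.stdout.strip())
```

The same effect can be had from a shell by prefixing the command with OMP_NUM_THREADS=1.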
>>>>>>
>>>>>> Thanks,
>>>>>> Pierre
>>>>>>
>>>>>> (*) https://gitlab.com/petsc/petsc/-/merge_requests/3712
>>>>>>
>>>>>>> Thank you for the Docker files, they were really useful.
>>>>>>> If you want to avoid oversubscription failures, you can edit
>>>>>>> the file /opt/openmpi-4.1.0/etc/openmpi-default-hostfile and
>>>>>>> append the line:
>>>>>>> localhost slots=12
>>>>>>> If you want to increase the timeout limit of PETSc test
>>>>>>> suite for each test, you can add the extra flag in your
>>>>>>> command line TIMEOUT=180 (default is 60, units are seconds).
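
The hostfile edit above can be made repeatable with a small helper that appends the line only if it is not already present. This is a sketch demonstrated on a temporary file; ensure_line is a name invented for this example, and editing the real file under /opt may need elevated permissions.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append line to the file at path unless an identical line is already there."""
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in (entry.strip() for entry in content.splitlines()):
        return False
    with open(path, "a") as handle:
        # Keep the new entry on its own line even if the file lacks a final newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True


# Demonstrated on a temporary file; the real target named in the message is
# /opt/openmpi-4.1.0/etc/openmpi-default-hostfile.
demo = os.path.join(tempfile.mkdtemp(), "openmpi-default-hostfile")
print(ensure_line(demo, "localhost slots=12"))
print(ensure_line(demo, "localhost slots=12"))
```

The second call is a no-op, so the helper can be run on every container build without duplicating the entry.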
>>>>>>>
>>>>>>> Thanks, I’ll ping you on GitLab when I’ve got something
>>>>>>> ready for you to try,
>>>>>>> Pierre
>>>>>>>
>>>>>>> <ompi.cxx>
>>>>>>>
>>>>>>>> On 12 Mar 2021, at 8:54 PM, Eric Chamberland
>>>>>>>> <Eric.Chamberland at giref.ulaval.ca> wrote:
>>>>>>>>
>>>>>>>> Hi Pierre,
>>>>>>>>
>>>>>>>> I now have a docker container reproducing the problems here.
>>>>>>>>
>>>>>>>> Actually, if I look at
>>>>>>>> snes_tutorials-ex12_quad_singular_hpddm it fails like this:
>>>>>>>>
>>>>>>>> not ok snes_tutorials-ex12_quad_singular_hpddm # Error code: 59
>>>>>>>> # Initial guess
>>>>>>>> # L_2 Error: 0.00803099
>>>>>>>> # Initial Residual
>>>>>>>> # L_2 Residual: 1.09057
>>>>>>>> # Au - b = Au + F(0)
>>>>>>>> # Linear L_2 Residual: 1.09057
>>>>>>>> # [d470c54ce086:14127] Read -1, expected 4096, errno = 1
>>>>>>>> # [d470c54ce086:14128] Read -1, expected 4096, errno = 1
>>>>>>>> # [d470c54ce086:14129] Read -1, expected 4096, errno = 1
>>>>>>>> # [3]PETSC ERROR:
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>> # [3]PETSC ERROR: Caught signal number 11 SEGV:
>>>>>>>> Segmentation Violation, probably memory access out of range
>>>>>>>> # [3]PETSC ERROR: Try option -start_in_debugger or
>>>>>>>> -on_error_attach_debugger
>>>>>>>> # [3]PETSC ERROR: or see
>>>>>>>> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>> # [3]PETSC ERROR: or try http://valgrind.org
>>>>>>>> on GNU/linux and Apple Mac OS X to
>>>>>>>> find memory corruption errors
>>>>>>>> # [3]PETSC ERROR: likely location of problem given in stack
>>>>>>>> below
>>>>>>>> # [3]PETSC ERROR: --------------------- Stack Frames
>>>>>>>> ------------------------------------
>>>>>>>> # [3]PETSC ERROR: Note: The EXACT line numbers in the stack
>>>>>>>> are not available,
>>>>>>>> # [3]PETSC ERROR: INSTEAD the line number of the start of
>>>>>>>> the function
>>>>>>>> # [3]PETSC ERROR: is given.
>>>>>>>> # [3]PETSC ERROR: [3] buildTwo line 987
>>>>>>>> /opt/petsc-main/include/HPDDM_schwarz.hpp
>>>>>>>> # [3]PETSC ERROR: [3] next line 1130
>>>>>>>> /opt/petsc-main/include/HPDDM_schwarz.hpp
>>>>>>>> # [3]PETSC ERROR: --------------------- Error Message
>>>>>>>> --------------------------------------------------------------
>>>>>>>> # [3]PETSC ERROR: Signal received
>>>>>>>> # [3]PETSC ERROR: [0]PETSC ERROR:
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Also, ex12_quad_hpddm_reuse_baij fails with many more "Read
>>>>>>>> -1, expected ..." messages, and I don't know where they come from.
>>>>>>>>
>>>>>>>> Hypre (like in diff-snes_tutorials-ex56_hypre) is also
>>>>>>>> having DIVERGED_INDEFINITE_PC failures...
>>>>>>>>
>>>>>>>> Please see the 3 attached docker files:
>>>>>>>>
>>>>>>>> 1) fedora_mkl_and_devtools: the Dockerfile which installs
>>>>>>>> Fedora 33 with GNU compilers, MKL, and everything needed to develop.
>>>>>>>>
>>>>>>>> 2) openmpi: the Dockerfile to build OpenMPI
>>>>>>>>
>>>>>>>> 3) petsc: the last Dockerfile, which builds, installs, and tests PETSc
>>>>>>>>
>>>>>>>> I build the 3 like this:
>>>>>>>>
>>>>>>>> docker build -t fedora_mkl_and_devtools -f
>>>>>>>> fedora_mkl_and_devtools .
>>>>>>>>
>>>>>>>> docker build -t openmpi -f openmpi .
>>>>>>>>
>>>>>>>> docker build -t petsc -f petsc .
>>>>>>>>
>>>>>>>> Disclaimer: I am not a Docker expert, so I may do things
>>>>>>>> that are not Docker state-of-the-art, but I am open to
>>>>>>>> suggestions... ;)
>>>>>>>>
>>>>>>>> I have just run it on my laptop (which took a long time); it does
>>>>>>>> not have enough cores, so many more tests failed (I should force
>>>>>>>> --oversubscribe but don't know how to). I will relaunch on
>>>>>>>> my workstation in a few minutes.
>>>>>>>>
>>>>>>>> I will now test your branch! (sorry for the delay).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>> On 2021-03-11 9:03 a.m., Eric Chamberland wrote:
>>>>>>>>>
>>>>>>>>> Hi Pierre,
>>>>>>>>>
>>>>>>>>> ok, that's interesting!
>>>>>>>>>
>>>>>>>>> I will try to build a Docker image by tomorrow and give
>>>>>>>>> you the exact recipe to reproduce the bugs.
>>>>>>>>>
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2021-03-11 2:46 a.m., Pierre Jolivet wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 11 Mar 2021, at 6:16 AM, Barry Smith
>>>>>>>>>>> <bsmith at petsc.dev> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Eric,
>>>>>>>>>>>
>>>>>>>>>>> Sorry about not being more immediate. We still have
>>>>>>>>>>> this in our active email so you don't need to submit
>>>>>>>>>>> individual issues. We'll try to get to them as soon as
>>>>>>>>>>> we can.
>>>>>>>>>>
>>>>>>>>>> Indeed, I’m still trying to figure this out.
>>>>>>>>>> I realized that some of my configure flags were different
>>>>>>>>>> than yours, e.g., no --with-memalign.
>>>>>>>>>> I’ve also added SuperLU_DIST to my installation.
>>>>>>>>>> Still, I can’t reproduce any issue.
>>>>>>>>>> I will continue looking into this, it appears I’m seeing
>>>>>>>>>> some valgrind errors, but I don’t know if this is some
>>>>>>>>>> side effect of OpenMPI not being valgrind-clean (last
>>>>>>>>>> time I checked, there was no error with MPICH).
>>>>>>>>>>
>>>>>>>>>> Thank you for your patience,
>>>>>>>>>> Pierre
>>>>>>>>>>
>>>>>>>>>> /usr/bin/gmake -f gmakefile test test-fail=1
>>>>>>>>>> Using MAKEFLAGS: test-fail=1
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_baij.counts
>>>>>>>>>> ok snes_tutorials-ex12_quad_hpddm_reuse_baij
>>>>>>>>>> ok diff-snes_tutorials-ex12_quad_hpddm_reuse_baij
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex33_superlu_dist_2.counts
>>>>>>>>>> ok ksp_ksp_tests-ex33_superlu_dist_2
>>>>>>>>>> ok diff-ksp_ksp_tests-ex33_superlu_dist_2
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex49_superlu_dist.counts
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0
>>>>>>>>>> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1
>>>>>>>>>> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex50_tut_2.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex50_tut_2
>>>>>>>>>> ok diff-ksp_ksp_tutorials-ex50_tut_2
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex33_superlu_dist.counts
>>>>>>>>>> ok ksp_ksp_tests-ex33_superlu_dist
>>>>>>>>>> ok diff-ksp_ksp_tests-ex33_superlu_dist
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_hypre.counts
>>>>>>>>>> ok snes_tutorials-ex56_hypre
>>>>>>>>>> ok diff-snes_tutorials-ex56_hypre
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex56_2.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex56_2
>>>>>>>>>> ok diff-ksp_ksp_tutorials-ex56_2
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_elas.counts
>>>>>>>>>> ok snes_tutorials-ex17_3d_q3_trig_elas
>>>>>>>>>> ok diff-snes_tutorials-ex17_3d_q3_trig_elas
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij.counts
>>>>>>>>>> ok snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij
>>>>>>>>>> ok diff-snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_3.counts
>>>>>>>>>> not ok ksp_ksp_tutorials-ex5_superlu_dist_3 # Error code: 1
>>>>>>>>>> #srun: error: Unable to create step for job 1426755: More
>>>>>>>>>> processors requested than permitted
>>>>>>>>>> ok ksp_ksp_tutorials-ex5_superlu_dist_3 # SKIP Command
>>>>>>>>>> failed so no diff
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex5f_superlu_dist # SKIP Fortran
>>>>>>>>>> required for this test
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_tri_parmetis_hpddm_baij.counts
>>>>>>>>>> ok snes_tutorials-ex12_tri_parmetis_hpddm_baij
>>>>>>>>>> ok diff-snes_tutorials-ex12_tri_parmetis_hpddm_baij
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_tut_3.counts
>>>>>>>>>> ok snes_tutorials-ex19_tut_3
>>>>>>>>>> ok diff-snes_tutorials-ex19_tut_3
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_vlap.counts
>>>>>>>>>> ok snes_tutorials-ex17_3d_q3_trig_vlap
>>>>>>>>>> ok diff-snes_tutorials-ex17_3d_q3_trig_vlap
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist_3.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex5f_superlu_dist_3 # SKIP Fortran
>>>>>>>>>> required for this test
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_superlu_dist.counts
>>>>>>>>>> ok snes_tutorials-ex19_superlu_dist
>>>>>>>>>> ok diff-snes_tutorials-ex19_superlu_dist
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre.counts
>>>>>>>>>> ok
>>>>>>>>>> snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre
>>>>>>>>>> ok
>>>>>>>>>> diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex49_hypre_nullspace.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex49_hypre_nullspace
>>>>>>>>>> ok diff-ksp_ksp_tutorials-ex49_hypre_nullspace
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_superlu_dist_2.counts
>>>>>>>>>> ok snes_tutorials-ex19_superlu_dist_2
>>>>>>>>>> ok diff-snes_tutorials-ex19_superlu_dist_2
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_2.counts
>>>>>>>>>> not ok ksp_ksp_tutorials-ex5_superlu_dist_2 # Error code: 1
>>>>>>>>>> #srun: error: Unable to create step for job 1426755: More
>>>>>>>>>> processors requested than permitted
>>>>>>>>>> ok ksp_ksp_tutorials-ex5_superlu_dist_2 # SKIP Command
>>>>>>>>>> failed so no diff
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre.counts
>>>>>>>>>> ok
>>>>>>>>>> snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre
>>>>>>>>>> ok
>>>>>>>>>> diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex64_1.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex64_1
>>>>>>>>>> ok diff-ksp_ksp_tutorials-ex64_1
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist.counts
>>>>>>>>>> not ok ksp_ksp_tutorials-ex5_superlu_dist # Error code: 1
>>>>>>>>>> #srun: error: Unable to create step for job 1426755: More
>>>>>>>>>> processors requested than permitted
>>>>>>>>>> ok ksp_ksp_tutorials-ex5_superlu_dist # SKIP Command
>>>>>>>>>> failed so no diff
>>>>>>>>>> TEST
>>>>>>>>>> arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist_2.counts
>>>>>>>>>> ok ksp_ksp_tutorials-ex5f_superlu_dist_2 # SKIP Fortran
>>>>>>>>>> required for this test
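
A log like the one above is easier to triage once the pass, skip and failure counts are separated, for instance to see that the only failures here are the srun "More processors requested than permitted" cases. The sketch below assumes the harness lines have been unwrapped and the quote markers stripped; summarize and the regular expression are written for this illustration and are not part of the PETSc test harness.

```python
import re
from collections import Counter

# "ok NAME", "not ok NAME", optionally followed by "# comment".
RESULT = re.compile(r"^\s*(not ok|ok)\s+(\S+)(?:\s+#\s*(.*))?$")


def summarize(lines):
    """Count passed, skipped and failed entries in TAP-like harness output."""
    counts = Counter()
    failed = []
    for line in lines:
        match = RESULT.match(line)
        if not match:
            continue  # TEST headers, paths and '#' diagnostics are ignored
        status, name, comment = match.groups()
        if status == "not ok":
            counts["failed"] += 1
            failed.append(name)
        elif comment and comment.upper().startswith("SKIP"):
            counts["skipped"] += 1
        else:
            counts["passed"] += 1
    return counts, failed


sample = [
    " ok snes_tutorials-ex56_hypre",
    " ok diff-snes_tutorials-ex56_hypre",
    "not ok ksp_ksp_tutorials-ex5_superlu_dist_3 # Error code: 1",
    "#srun: error: Unable to create step for job 1426755: "
    "More processors requested than permitted",
    " ok ksp_ksp_tutorials-ex5_superlu_dist_3 # SKIP Command failed so no diff",
    " ok ksp_ksp_tutorials-ex5f_superlu_dist # SKIP Fortran required for this test",
]
counts, failed = summarize(sample)
print(dict(counts), failed)
```

On this six-line sample the helper reports two passes, two skips and one failure, and names the failing test so it can be rerun on its own.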
>>>>>>>>>>
>>>>>>>>>>> Barry
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On Mar 10, 2021, at 11:03 PM, Eric Chamberland
>>>>>>>>>>>> <Eric.Chamberland at giref.ulaval.ca> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Barry,
>>>>>>>>>>>>
>>>>>>>>>>>> to get some follow-up on the --with-openmp=1 failures,
>>>>>>>>>>>> shall I open GitLab issues for:
>>>>>>>>>>>>
>>>>>>>>>>>> a) all hypre failures giving DIVERGED_INDEFINITE_PC
>>>>>>>>>>>>
>>>>>>>>>>>> b) all superlu_dist failures giving different results
>>>>>>>>>>>> with initia and "Exceeded timeout limit of 60 s"
>>>>>>>>>>>>
>>>>>>>>>>>> c) hpddm failures "free(): invalid next size (fast)"
>>>>>>>>>>>> and "Segmentation Violation"
>>>>>>>>>>>>
>>>>>>>>>>>> d) all tao's "Exceeded timeout limit of 60 s"
>>>>>>>>>>>>
>>>>>>>>>>>> I don't see how I could do all this debugging by myself...
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Eric
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Eric Chamberland, ing., M. Ing
>>>>>>>>> Professionnel de recherche
>>>>>>>>> GIREF/Université Laval
>>>>>>>>> (418) 656-2131 poste 41 22 42
>>>>>>>> <fedora_mkl_and_devtools.txt><openmpi.txt><petsc.txt>
>>>>>>>
>>>>>>
>>>>>
>>>
>
--
Eric Chamberland, ing., M. Ing
Professionnel de recherche
GIREF/Université Laval
(418) 656-2131 poste 41 22 42