[petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS

Pierre Jolivet pierre at joliv.et
Mon Mar 18 14:17:18 CDT 2024



> On 18 Mar 2024, at 7:59 PM, Satish Balay via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> On Mon, 18 Mar 2024, Satish Balay via petsc-users wrote:
> 
>> On Mon, 18 Mar 2024, Pierre Jolivet wrote:
>> 
>>> 
>>> 
>>>> On 18 Mar 2024, at 5:13 PM, Satish Balay via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>> 
>>>> Ah - the compiler did flag code bugs.
>>>> 
>>>>> (current version is 0.3.26 but we can’t update because there is a huge performance regression which makes the pipeline timeout)
>>>> 
>>>> maybe we should retry - updating to the latest snapshot and see if this issue persists.
>>> 
>>> Well, it’s easy to see that it is _still_ broken: https://gitlab.com/petsc/petsc/-/jobs/6419779589
>>> The infamous gcc segfault that won’t let us run the pipeline, but that builds fine when you connect to the machine yourself (I bothered you about this a couple of months ago, in case you don’t remember; see https://gitlab.com/petsc/petsc/-/merge_requests/7143).
>> 
>>> make[2]: *** [../../Makefile.tail:46: libs] Bus error (core dumped)
>> 
>> Ah - ok - that's a strange error. I'm not sure how to debug it. [it fails when the build is invoked from configure - but not when it's invoked directly from bash/shell.]
> 
> Pushed a potential workaround to jolivet/test-openblas

And here we go: https://gitlab.com/petsc/petsc/-/jobs/6420606887
20 minutes in, and still in the dm_* tests with timeouts right, left, and center.
For reference, this prior job https://gitlab.com/petsc/petsc/-/jobs/6418468279 completed in 3 minutes (OK, maybe add a couple of minutes to rebuild the packages to have a fair comparison).
What did they do to OpenBLAS? Add a sleep() in their axpy?

Thanks,
Pierre

> Note: The failure also comes up on the same OS (Fedora 39) on x64.
> 
> Satish
> 
>> 
>> Satish
>> 
>>> 
>>> Thanks,
>>> Pierre
>>> 
>>>> 
>>>> Satish
>>>> 
>>>> On Mon, 18 Mar 2024, Zongze Yang wrote:
>>>> 
>>>>> The OpenBLAS issue was resolved by this PR: https://github.com/OpenMathLib/OpenBLAS/pull/4565
>>>>> 
>>>>> Best wishes,
>>>>> Zongze
>>>>> 
>>>>>> On 18 Mar 2024, at 00:50, Zongze Yang <yangzongze at gmail.com> wrote:
>>>>>> 
>>>>>> It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new compiler version has changed?
>>>>>> 
>>>>>> Best wishes,
>>>>>> Zongze
>>>>>>> On 18 Mar 2024, at 00:23, Satish Balay <balay at mcs.anl.gov> wrote:
>>>>>>> 
>>>>>>> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is.
>>>>>>> 
>>>>>>> One difference is - I have a slightly older xcode. However, your compiler appears to behave as if using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here?
>>>>>>> 
>>>>>>> Satish
>>>>>>> 
>>>>>>> ----
>>>>>>> Executing: gcc --version
>>>>>>> stdout:
>>>>>>> Apple clang version 15.0.0 (clang-1500.3.9.4)
>>>>>>> 
>>>>>>> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show
>>>>>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi
>>>>>>> 
>>>>>>> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o
>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>>>>>>  RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info);
>>>>>>>                                                                              ^~~~
>>>>>>>                                                                              &
>>>>>>> 
>>>>>>> vs:
>>>>>>> Executing: gcc --version
>>>>>>> stdout:
>>>>>>> Apple clang version 15.0.0 (clang-1500.1.0.2.5)
>>>>>>> 
>>>>>>> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show
>>>>>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi
>>>>>>> 
>>>>>>> 
>>>>>>> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o
>>>>>>> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>>>>>>  RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info);
>>>>>>>                                                                              ^~~~
>>>>>>>                                                                              &
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Sun, 17 Mar 2024, Pierre Jolivet wrote:
>>>>>>> 
>>>>>>>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping…
>>>>>>>> Take Barry’s advice.
>>>>>>>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction from us, by forcing the use of ld-classic (https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2), so that’s another good argument in favor of -framework Accelerate.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Pierre
>>>>>>>> 
>>>>>>>> PS: has anyone benchmarked those https://developer.apple.com/documentation/accelerate/sparse_solvers ? I didn’t even know they existed.
>>>>>>>> 
>>>>>>>>> On 17 Mar 2024, at 3:06 PM, Zongze Yang <yangzongze at gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> Understood. Thank you for your advice.
>>>>>>>>> 
>>>>>>>>> Best wishes,
>>>>>>>>> Zongze
>>>>>>>>> 
>>>>>>>>>> On 17 Mar 2024, at 22:04, Barry Smith <bsmith at petsc.dev> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I would just avoid the --download-openblas  option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Mar 17, 2024, at 9:58 AM, Zongze Yang <yangzongze at gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted.
>>>>>>>>>>> 
>>>>>>>>>>> Best wishes,
>>>>>>>>>>> Zongze
>>>>>>>>>>> 
>>>>>>>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet <pierre at joliv.et> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 17 Mar 2024, at 1:04 PM, Zongze Yang <yangzongze at gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thank you for providing the instructions. I tried the first option.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Now the configure error is related to OpenBLAS.
>>>>>>>>>>>>> Adding `--CFLAGS=-Wno-int-conversion` to the configure command resolves this. Should this be reported to OpenBLAS, or does the configure in PETSc need a fix?
>>>>>>>>>>>> 
>>>>>>>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well?
>>>>>>>>>>>> I don’t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can’t update because there is a huge performance regression which makes the pipeline timeout).
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Pierre
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The configure.log is attached. The errors are shown below:
>>>>>>>>>>>>>  ```
>>>>>>>>>>>>>  src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>>>>>>>>>>>>      RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info);
>>>>>>>>>>>>>                                                                                  ^~~~
>>>>>>>>>>>>>                                                                                  &
>>>>>>>>>>>>>  src/../inc/relapack.h:74:216: note: passing argument to parameter here
>>>>>>>>>>>>>  void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *);
>>>>>>>>>>>>>                                                                                                                                                                                        ^
>>>>>>>>>>>>>  src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>>>>>>>>>>>>      RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info);
>>>>>>>>>>>>>                                                                                  ^~~~
>>>>>>>>>>>>>                                                                                  &
>>>>>>>>>>>>>  src/../inc/relapack.h:75:221: note: passing argument to parameter here
>>>>>>>>>>>>>  void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *);
>>>>>>>>>>>>>                                                                                                                                                                                        ^
>>>>>>>>>>>>>  src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>>>>>>>>>>>>      RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info);
>>>>>>>>>>>>>                                                                                  ^~~~
>>>>>>>>>>>>>                                                                                  &
>>>>>>>>>>>>>  src/../inc/relapack.h:76:216: note: passing argument to parameter here
>>>>>>>>>>>>>  void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *);
>>>>>>>>>>>>>                                                                                                                                                                                        ^
>>>>>>>>>>>>>  src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion]
>>>>>>>>>>>>>      RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info);
>>>>>>>>>>>>>                                                                                  ^~~~
>>>>>>>>>>>>>                                                                                  &
>>>>>>>>>>>>>  src/../inc/relapack.h:77:221: note: passing argument to parameter here
>>>>>>>>>>>>>  void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *);
>>>>>>>>>>>>>                                                                                                                                                                                        ^
>>>>>>>>>>>>>  4 errors generated.
>>>>>>>>>>>>>  ```
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>> Zongze
>>>>>>>>>>>>> 
>>>>>>>>>>>>> <configure.log.tar.gz>
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet <pierre at joliv.et> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> You need this MR: https://gitlab.com/petsc/petsc/-/merge_requests/7365
>>>>>>>>>>>>>> main has been broken for macOS since https://gitlab.com/petsc/petsc/-/merge_requests/7341, so the alternative is to revert to the commit prior.
>>>>>>>>>>>>>> It should work either way.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Pierre
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 17 Mar 2024, at 11:31 AM, Zongze Yang <yangzongze at gmail.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi, PETSc Team,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I am trying to install petsc with the following configuration
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> ./configure \
>>>>>>>>>>>>>>>  --download-bison \
>>>>>>>>>>>>>>>  --download-mpich \
>>>>>>>>>>>>>>>  --download-mpich-configure-arguments=--disable-opencl \
>>>>>>>>>>>>>>>  --download-hwloc \
>>>>>>>>>>>>>>>  --download-hwloc-configure-arguments=--disable-opencl \
>>>>>>>>>>>>>>>  --download-openblas \
>>>>>>>>>>>>>>>  --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \
>>>>>>>>>>>>>>>  --with-shared-libraries=1 \
>>>>>>>>>>>>>>>  --with-fortran-bindings=0 \
>>>>>>>>>>>>>>>  --with-zlib \
>>>>>>>>>>>>>>>  LDFLAGS=-Wl,-ld_classic
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The log shows that
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library!
>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I recently updated the system and Xcode, as well as homebrew.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The configure.log is attached.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks for your attention to this matter.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>>>> Zongze
>>>>>>>>>>>>>>> <configure.log.tar.gz>
>>>>>>>> 
>>>>>>> <configure.log.gz>
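
For reference, the two workarounds discussed in this thread can be sketched as a small shell helper. This is only a sketch under stated assumptions: every configure option is copied from the messages above, while the function name `petsc_configure_args` and the `system`/`openblas` labels are hypothetical and exist only for this example. It prints the options for review and does not run configure. Whether `LDFLAGS=-Wl,-ld_classic` is still wanted once the MR mentioned above is applied is not settled in the thread; it is carried over unchanged from the original report.

```shell
#!/bin/sh
# Sketch only: prints configure options for review; it does not run ./configure.
# All options are copied from the thread. The function name and the
# "system"/"openblas" labels are hypothetical, chosen for this example.

petsc_configure_args() {
    mode=${1:-system}

    # Options shared by both variants, taken from the original report.
    set -- \
        --download-bison \
        --download-mpich \
        --download-mpich-configure-arguments=--disable-opencl \
        --download-hwloc \
        --download-hwloc-configure-arguments=--disable-opencl \
        --with-shared-libraries=1 \
        --with-fortran-bindings=0 \
        --with-zlib \
        LDFLAGS=-Wl,-ld_classic

    if [ "$mode" = openblas ]; then
        # Keep OpenBLAS and silence the int-conversion diagnostic,
        # as reported to work earlier in the thread.
        set -- "$@" \
            --download-openblas \
            "--download-openblas-make-options='USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \
            --CFLAGS=-Wno-int-conversion
    fi
    # In "system" mode nothing is added: without --download-openblas the
    # BLAS/LAPACK choice is left to configure, following the advice to avoid
    # that option on macOS.

    # One option per line, so the list is easy to inspect.
    printf '%s\n' "$@"
}

petsc_configure_args "${1:-system}"
```

Calling `petsc_configure_args openblas` lists the variant that keeps OpenBLAS with the extra CFLAGS; with no argument it lists the variant without `--download-openblas`.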
