From fab4100 at posteo.ch Fri Mar 1 03:40:11 2024 From: fab4100 at posteo.ch (Fabian Wermelinger) Date: Fri, 1 Mar 2024 09:40:11 +0000 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR Message-ID: An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 1 09:58:10 2024 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 1 Mar 2024 10:58:10 -0500 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: On Fri, Mar 1, 2024 at 10:28?AM Fabian Wermelinger wrote: > Dear All, I am implementing a linear solver interface in a flow solver > with support for PETSc. My application uses a parallel CSR representation > and it manages the memory for it. I would like to wrap PETSc matrices (and > vectors) around it such > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Dear All, > > I am implementing a linear solver interface in a flow solver with support for > PETSc. My application uses a parallel CSR representation and it manages the > memory for it. I would like to wrap PETSc matrices (and vectors) around it such > that I can use the PETSc solvers as well. I plan to use > MatMPIBAIJSetPreallocationCSR and VecCreateMPIWithArray for lightweight > wrapping. The matrix structure is static over the course of iterations. I am > using a derived context class to host the PETSc related context. This context > holds references to the PETSc matrix and vectors and KSP/PC required to call the > solver API later in the iteration loop. I would like to create as much as > possible during creation of the context at the beginning of iterations (the > context will live through iterations). > > My understanding is that MatMPIBAIJSetPreallocationCSR and VecCreateMPIWithArray > DO NOT copy such that I can wrap the PETSc types around the memory managed by > the hosting linear solver framework in the application. The system matrix and > RHS (the pointers to these arrays are passed to MatMPIBAIJSetPreallocationCSR > and VecCreateMPIWithArray, respectively) is assembled by the application before > any call to a linear solver. > > Given this setting: for every iteration, my plan is the PETSc information from > the context (Mat, Vec, KSP) and simply call KSPSolve without any other PETSc > calls (still assuming the matrix structure is static during iteration). > > What is not clear to me: > > Are there any MatSetValues/VecSetValues calls followed by > MatAssembly/VecAssembly(Begin/End) calls required for this setting? > > I don't believe so. You are doing the assembly. If you want to tell PETSc (KSP) that you changed the matrix so that it will re-setup the solvers (eg, refactor) then you can call KSPSetOperators. (You might want to increment the state of the Mat so that other things like TS know the state has changed, maybe that is all that you need to do) > The data in > the arrays for which pointers have been passed to MatMPIBAIJSetPreallocationCSR > and VecCreateMPIWithArray is computed prior to any solver call in an iteration, > such that I am assuming no additional "set value" calls through PETSc are > required -> am I missing something important by assuming this? > > That sounds fine, but keep in mind that you need to use our data layout for the blocks if you use our MatVec, etc. Thanks, Mark > Thank you for taking the time! > > -- > fabs > > -------------- next part -------------- An HTML attachment was scrubbed... 
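A minimal sketch of the per-time-step calls Mark describes, with assumed names (A, b, x, ksp held in the application's solver context) and PetscCall() error checking as in recent PETSc; this is an illustration, not code from the thread:

  /* once per outer iteration, after the application has refreshed its matrix and RHS data */
  PetscCall(PetscObjectStateIncrease((PetscObject)A)); /* mark the operator as changed */
  PetscCall(KSPSetOperators(ksp, A, A));               /* have KSP/PC redo its setup, e.g. refactor */
  PetscCall(KSPSolve(ksp, b, x));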
URL: From fab4100 at posteo.ch Fri Mar 1 10:03:33 2024 From: fab4100 at posteo.ch (Fabian Wermelinger) Date: Fri, 1 Mar 2024 16:03:33 +0000 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Mar 1 10:09:17 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 1 Mar 2024 10:09:17 -0600 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: On Fri, Mar 1, 2024 at 9:28?AM Fabian Wermelinger wrote: > Dear All, I am implementing a linear solver interface in a flow solver > with support for PETSc. My application uses a parallel CSR representation > and it manages the memory for it. I would like to wrap PETSc matrices (and > vectors) around it such > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Dear All, > > I am implementing a linear solver interface in a flow solver with support for > PETSc. My application uses a parallel CSR representation and it manages the > memory for it. I would like to wrap PETSc matrices (and vectors) around it such > that I can use the PETSc solvers as well. I plan to use > MatMPIBAIJSetPreallocationCSR and VecCreateMPIWithArray for lightweight > wrapping. The matrix structure is static over the course of iterations. I am > using a derived context class to host the PETSc related context. This context > holds references to the PETSc matrix and vectors and KSP/PC required to call the > solver API later in the iteration loop. I would like to create as much as > possible during creation of the context at the beginning of iterations (the > context will live through iterations). > > My understanding is that MatMPIBAIJSetPreallocationCSR and VecCreateMPIWithArray > DO NOT copy such that I can wrap the PETSc types around the memory managed by > the hosting linear solver framework in the application. The system matrix and > RHS (the pointers to these arrays are passed to MatMPIBAIJSetPreallocationCSR > and VecCreateMPIWithArray, respectively) is assembled by the application before > any call to a linear solver. > > No. MatMPIBAIJSetPreallocationCSR() copies the data, but VecCreateMPIWithArray() does not copy (only use pointers user provided). PETSc MATMPIAIJ or MATMPIBAIJ has a complicated internal data structure. It is not easy for users to get it right. I think your intention is to avoid memory copies. Don't worry about it too much. If it is MATMPIAIJ, we can do MatMPIAIJSetPreallocationCSR(A, i, j, v); // let petsc copy i, j v // To quickly update A when you have an updated v[] MatUpdateMPIAIJWithArray(A, v); // copy v[], but faster than MatSetValues() But it seems we currently do not have a MatUpdateMPIBAIJWithArray() :( > Given this setting: for every iteration, my plan is the PETSc information from > the context (Mat, Vec, KSP) and simply call KSPSolve without any other PETSc > calls (still assuming the matrix structure is static during iteration). > > What is not clear to me: > > Are there any MatSetValues/VecSetValues calls followed by > MatAssembly/VecAssembly(Begin/End) calls required for this setting? 
The data in > the arrays for which pointers have been passed to MatMPIBAIJSetPreallocationCSR > and VecCreateMPIWithArray is computed prior to any solver call in an iteration, > such that I am assuming no additional "set value" calls through PETSc are > required -> am I missing something important by assuming this? > > Thank you for taking the time! > > -- > fabs > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fab4100 at posteo.ch Fri Mar 1 11:08:51 2024 From: fab4100 at posteo.ch (Fabian Wermelinger) Date: Fri, 1 Mar 2024 17:08:51 +0000 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Mar 1 12:51:45 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 1 Mar 2024 12:51:45 -0600 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: On Fri, Mar 1, 2024 at 11:10?AM Fabian Wermelinger wrote: > On Fri, 01 Mar 2024 10:09:17 -0600, Junchao Zhang wrote: > >No. MatMPIBAIJSetPreallocationCSR() copies the data, but > >VecCreateMPIWithArray() does not copy (only use pointers user provided). > > OK, my understanding was that MatMPIBAIJSetPreallocationCSR DOES NOT copy > since > it has direct access to matrix entries in the required PETSc layout -> > thank you > > >I think your intention is to avoid memory copies. > > Yes. (not only copies but also avoid unnecessary increase of memory > footprint) > > >If it is MATMPIAIJ, we can do > > MatMPIAIJSetPreallocationCSR(A, i, j, v); // let petsc copy i, j v > > > > // To quickly update A when you have an updated v[] > > > > MatUpdateMPIAIJWithArray(A, v); // copy v[], but faster than > MatSetValues() > > Thanks for this input! > > >But it seems we currently do not have a MatUpdateMPIBAIJWithArray() :( > > So when my v[] changed during iterations (only the v[] used during the > MatMPIBAIJSetPreallocationCSR call, the matrix structure/connectivity has > NOT > changed), it is not sufficient to just call KSPSetOperators() before > calling > KSPSolve()? Correct, your v[] is not shared with petsc > I must copy the updated values in v[] explicitly into the PETSc > matrix, then call KSPSetOperators() followed by KSPSolve()? (for every > iteration in my application) > Yes > > The preferred method then is to use MatSetValues() for the copies (or > possibly > MatSetValuesRow() or MatSetValuesBlocked())? > MatSetValuesRow() > > The term "Preallocation" is confusing to me. For example, > MatCreateMPIBAIJWithArrays clearly states in the doc that arrays are copied > (https://urldefense.us/v3/__https://petsc.org/release/manualpages/Mat/MatCreateMPIBAIJWithArrays/__;!!G_uCfscf7eWS!YLXb4luirYVduVkmxILIpET7hYcMAPTHUxHG7tR7os-mxUGGqYt_XTB0xPp5y5CU2gWNtgNEmD8gu3rmtqcqNfLfc2Oe$ ), > I would > then assume PETSc maintains internal storage for it. If something is > preallocated, I would not make that assumption. > "Preallocation" in petsc means "tell petsc sizes of rows in a matrix", so that petsc can preallocate the memory before you do MatSetValues(). This is clearer in https://urldefense.us/v3/__https://petsc.org/release/manualpages/Mat/MatMPIAIJSetPreallocation/__;!!G_uCfscf7eWS!YLXb4luirYVduVkmxILIpET7hYcMAPTHUxHG7tR7os-mxUGGqYt_XTB0xPp5y5CU2gWNtgNEmD8gu3rmtqcqNdzTPnSf$ > Thank you for your time, I appreciate the inputs! > > All best, > > -- > fabs > -------------- next part -------------- An HTML attachment was scrubbed... 
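Pulling the advice in this thread together, a rough sketch of the AIJ variant Junchao suggests; the names rows, cols, vals, rhs, sol, nlocal, Nglobal are placeholders for the application's CSR and array data, and PetscInitialize() is assumed to have been called already:

  #include <petscksp.h>

  Mat A;
  Vec b, x;
  KSP ksp;

  /* the vectors wrap the application's arrays directly (no copy) */
  PetscCall(VecCreateMPIWithArray(PETSC_COMM_WORLD, 1, nlocal, Nglobal, rhs, &b));
  PetscCall(VecCreateMPIWithArray(PETSC_COMM_WORLD, 1, nlocal, Nglobal, sol, &x));

  /* the matrix copies the CSR arrays at preallocation time */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, nlocal, nlocal, Nglobal, Nglobal));
  PetscCall(MatSetType(A, MATMPIAIJ));
  PetscCall(MatMPIAIJSetPreallocationCSR(A, rows, cols, vals));

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp));

  /* each outer iteration, after the application has refreshed vals[] and rhs[]: */
  PetscCall(MatUpdateMPIAIJWithArray(A, vals)); /* copies the new values; structure unchanged */
  PetscCall(KSPSolve(ksp, b, x));

For the original BAIJ case the creation is analogous (MATMPIBAIJ with MatMPIBAIJSetPreallocationCSR), but, as noted above, the per-iteration value update then has to go through MatSetValuesRow() followed by the usual MatAssemblyBegin/End(), since there is no MatUpdateMPIBAIJWithArray().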
URL: From sblondel at utk.edu Fri Mar 1 15:32:40 2024 From: sblondel at utk.edu (Blondel, Sophie) Date: Fri, 1 Mar 2024 21:32:40 +0000 Subject: [petsc-users] PAMI error on Summit In-Reply-To: References: Message-ID: I have been using --smpiargs "-gpu". I tried the benchmark with "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi 0" and it seems to work: Fri Mar 1 16:27:14 EST 2024 =========================================== Test: KSP performance - Poisson Input matrix: 27-pt finite difference stencil -n 100 DoFs = 1000000 Number of nonzeros = 26463592 Step1 - creating Vecs and Mat... Step2 - running KSPSolve()... Step3 - calculating error norm... Error norm: 5.591e-02 KSP iters: 63 KSPSolve: 3.16646 seconds FOM: 3.158e+05 DoFs/sec =========================================== ------------------------------------------------------------ Sender: LSF System Subject: Job 3322694: in cluster Done Job was submitted from host by user in cluster at Fri Mar 1 16:26:58 2024 Job was executed on host(s) <1*batch3>, in queue , as user in cluster at Fri Mar 1 16:27:00 2024 <42*a35n05> was used as the home directory. was used as the working directory. Started at Fri Mar 1 16:27:00 2024 Terminated at Fri Mar 1 16:27:26 2024 Results reported at Fri Mar 1 16:27:26 2024 The output (if any) is above this job summary. If I switch to "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi 1" it complains: Fri Mar 1 16:25:02 EST 2024 =========================================== Test: KSP performance - Poisson Input matrix: 27-pt finite difference stencil -n 100 DoFs = 1000000 Number of nonzeros = 26463592 Step1 - creating Vecs and Mat... [5]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI. [5]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To not see the message again, add the option to your .petscrc, OR add it to the env var PETSC_OPTIONS. [5]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu. [5]PETSC ERROR: For Open MPI, you need to configure it --with-cuda (https://urldefense.us/v3/__https://www.open-mpi.org/faq/?category=buildcuda__;!!G_uCfscf7eWS!aLysH-zjWmDwsHlAFfiaeMvNJbnCcCztIFruGWStqtDV6RM6j9Xq3dxWMo1b-PVhyKP8XJ_ZhAD1ku70lGtgr1Nj$ ) [5]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (https://urldefense.us/v3/__http://mvapich.cse.ohio-state.edu/userguide/gdr/__;!!G_uCfscf7eWS!aLysH-zjWmDwsHlAFfiaeMvNJbnCcCztIFruGWStqtDV6RM6j9Xq3dxWMo1b-PVhyKP8XJ_ZhAD1ku70lPt0OjLW$ ) [5]PETSC ERROR: For Cray-MPICH, you need to set MPICH_GPU_SUPPORT_ENABLED=1 (man mpi to see manual of cray-mpich) -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF with errorcode 76. Best, Sophie ________________________________ From: Junchao Zhang Sent: Thursday, February 29, 2024 17:09 To: Blondel, Sophie Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PAMI error on Summit You don't often get email from junchao.zhang at gmail.com. Learn why this is important Could you try a petsc example to see if the environment is good? 
For example, cd src/ksp/ksp/tutorials make bench_kspsolve mpirun -n 6 ./bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi {0 or 1} BTW, I remember to use gpu-aware mpi on Summit, one needs to pass --smpiargs "-gpu" to jsrun --Junchao Zhang On Thu, Feb 29, 2024 at 3:22?PM Blondel, Sophie via petsc-users > wrote: I still get the same error when deactivating GPU-aware MPI. I also tried unloading spectrum MPI and using openMPI instead (recompiling everything) and I get a segfault in PETSc in that case (still using GPU-aware MPI I think, at least not explicitly ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd I still get the same error when deactivating GPU-aware MPI. I also tried unloading spectrum MPI and using openMPI instead (recompiling everything) and I get a segfault in PETSc in that case (still using GPU-aware MPI I think, at least not explicitly turning it off): 0 TS dt 1e-12 time 0. [ERROR] [0]PETSC ERROR: [ERROR] ------------------------------------------------------------------------ [ERROR] [0]PETSC ERROR: [ERROR] Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [ERROR] [0]PETSC ERROR: [ERROR] Try option -start_in_debugger or -on_error_attach_debugger [ERROR] [0]PETSC ERROR: [ERROR] or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!aLysH-zjWmDwsHlAFfiaeMvNJbnCcCztIFruGWStqtDV6RM6j9Xq3dxWMo1b-PVhyKP8XJ_ZhAD1ku70lHxspwrD$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!aLysH-zjWmDwsHlAFfiaeMvNJbnCcCztIFruGWStqtDV6RM6j9Xq3dxWMo1b-PVhyKP8XJ_ZhAD1ku70lG1rgai_$ [ERROR] [0]PETSC ERROR: [ERROR] or try https://urldefense.us/v3/__https://docs.nvidia.com/cuda/cuda-memcheck/index.html__;!!G_uCfscf7eWS!aLysH-zjWmDwsHlAFfiaeMvNJbnCcCztIFruGWStqtDV6RM6j9Xq3dxWMo1b-PVhyKP8XJ_ZhAD1ku70lOKZK09-$ on NVIDIA CUDA systems to find memory corruption errors [ERROR] [0]PETSC ERROR: [ERROR] configure using --with-debugging=yes, recompile, link, and run [ERROR] [0]PETSC ERROR: [ERROR] to get more information on the crash. [ERROR] [0]PETSC ERROR: [ERROR] Run with -malloc_debug to check if memory corruption is causing the crash. -------------------------------------------------------------------------- Best, Sophie ________________________________ From: Blondel, Sophie via Xolotl-psi-development > Sent: Thursday, February 29, 2024 10:17 To: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov > Subject: [Xolotl-psi-development] PAMI error on Summit Hi, I am using PETSc build with the Kokkos CUDA backend on Summit but when I run my code with multiple MPI tasks I get the following error: 0 TS dt 1e-12 time 0. errno 14 pid 864558 xolotl: /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: size_t PAMI::Dev ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. errno 14 pid 864557 xolotl: /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: size_t PAMI::Dev ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. 
[e28n07:864557] *** Process received signal *** [e28n07:864557] Signal: Aborted (6) [e28n07:864557] Signal code: (-6) [e28n07:864557] [ 0] linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x2000000604d8] [e28n07:864557] [ 1] /lib64/glibc-hwcaps/power9/libc-2.28.so(gsignal+0xd8)[0x200005d796f8] [e28n07:864557] [ 2] /lib64/glibc-hwcaps/power9/libc-2.28.so(abort+0x164)[0x200005d53ff4] [e28n07:864557] [ 3] /lib64/glibc-hwcaps/power9/libc-2.28.so(+0x3d280)[0x200005d6d280] [e28n07:864557] [ 4] [e28n07:864558] *** Process received signal *** [e28n07:864558] Signal: Aborted (6) [e28n07:864558] Signal code: (-6) [e28n07:864558] [ 0] linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x2000000604d8] [e28n07:864558] [ 1] /lib64/glibc-hwcaps/power9/libc-2.28.so(gsignal+0xd8)[0x200005d796f8] [e28n07:864558] [ 2] /lib64/glibc-hwcaps/power9/libc-2.28.so(abort+0x164)[0x200005d53ff4] [e28n07:864558] [ 3] /lib64/glibc-hwcaps/power9/libc-2.28.so(+0x3d280)[0x200005d6d280] [e28n07:864558] [ 4] /lib64/glibc-hwcaps/power9/libc-2.28.so(__assert_fail+0x64)[0x200005d6d324] [e28n07:864557] [ 5] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get7GetRdmaINS_6Device5Shmem8DmaModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic12NativeAt omicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEELb0EEESL_E6simpleEP18pami_rget_simple_t+0x1d8)[0x20007f3971d8] [e28n07:864557] [ 6] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get13CompositeRGetINS1_4RGetES3_E6simpleEP18pami_rget_simple_t+0x40)[0x20007f2ecc10] [e28n07:864557] [ 7] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI7Context9rget_implEP18pami_rget_simple_t+0x28c)[0x20007f31a78c] [e28n07:864557] [ 8] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Rget+0x18)[0x20007f2d94a8] [e28n07:864557] [ 9] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(process_rndv_msg+0x46c)[0x2000a80159ac] [e28n07:864557] [10] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(pml_pami_recv_rndv_cb+0x2bc)[0x2000a801670c] [e28n07:864557] [11] /lib64/glibc-hwcaps/power9/libc-2.28.so(__assert_fail+0x64)[0x200005d6d324] [e28n07:864558] [ 5] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get7GetRdmaINS_6Device5Shmem8DmaModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic12NativeAt omicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEELb0EEESL_E6simpleEP18pami_rget_simple_t+0x1d8)[0x20007f3971d8] [e28n07:864558] [ 6] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 
(_ZN4PAMI8Protocol3Get13CompositeRGetINS1_4RGetES3_E6simpleEP18pami_rget_simple_t+0x40)[0x20007f2ecc10] [e28n07:864558] [ 7] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI7Context9rget_implEP18pami_rget_simple_t+0x28c)[0x20007f31a78c] [e28n07:864558] [ 8] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Rget+0x18)[0x20007f2d94a8] [e28n07:864558] [ 9] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(process_rndv_msg+0x46c)[0x2000a80159ac] [e28n07:864558] [10] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(pml_pami_recv_rndv_cb+0x2bc)[0x2000a801670c] [e28n07:864558] [11] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol4Send11EagerSimpleINS_6Device5Shmem11PacketModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic 12NativeAtomicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEEEELNS1_15configuration_tE5EE15dispatch_packedEPvSP_mSP_SP_+0x4c)[0x20007f2e30ac] [e28n07:864557] [12] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Context_advancev+0x6b0)[0x20007f2da540] [e28n07:864557] [13] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(mca_pml_pami_progress+0x34)[0x2000a80073e4] [e28n07:864557] [14] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libopen-pal.so.3(opal_ progress+0x6c)[0x20003d60640c] [e28n07:864557] [15] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(ompi_r equest_default_wait_all+0x144)[0x2000034c4b04] [e28n07:864557] [16] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(PMPI_W aitall+0x10c)[0x20000352790c] [e28n07:864557] [17] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol4Send11EagerSimpleINS_6Device5Shmem11PacketModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic 12NativeAtomicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEEEELNS1_15configuration_tE5EE15dispatch_packedEPvSP_mSP_SP_+0x4c)[0x20007f2e30ac] [e28n07:864558] [12] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Context_advancev+0x6b0)[0x20007f2da540] [e28n07:864558] [13] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(mca_pml_pami_progress+0x34)[0x2000a80073e4] 
[e28n07:864558] [14] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libopen-pal.so.3(opal_ progress+0x6c)[0x20003d60640c] [e28n07:864558] [15] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(ompi_r equest_default_wait_all+0x144)[0x2000034c4b04] [e28n07:864558] [16] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(PMPI_W aitall+0x10c)[0x20000352790c] [e28n07:864558] [17] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3ca7b0)[0x2000004ea7b0] [e28n07:864557] [18] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3ca7b0)[0x2000004ea7b0] [e28n07:864558] [18] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3c5e68)[0x2000004e5e68] [e28n07:864557] [19] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3c5e68)[0x2000004e5e68] [e28n07:864558] [19] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(PetscSFBcastEnd+0x74)[0x2000004c9214] [e28n07:864557] [20] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(PetscSFBcastEnd+0x74)[0x2000004c9214] [e28n07:864558] [20] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3b4cb0)[0x2000004d4cb0] [e28n07:864557] [21] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3b4cb0)[0x2000004d4cb0] [e28n07:864558] [21] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(VecScatterEnd+0x178)[0x2000004dd038] [e28n07:864558] [22] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(VecScatterEnd+0x178)[0x2000004dd038] [e28n07:864557] [22] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x1112be0)[0x200001232be0] [e28n07:864558] [23] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x1112be0)[0x200001232be0] [e28n07:864557] [23] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(DMGlobalToLocalEnd+0x470)[0x200000e9b0f0] [e28n07:864557] [24] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11PetscSolver11rhsFunctionEP5_p_TSdP6_p_VecS5 _+0xc4)[0x200005f710d4] [e28n07:864557] [25] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 00005f7130c] [e28n07:864557] [26] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(DMGlobalToLocalEnd+0x470)[0x200000e9b0f0] [e28n07:864558] [24] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11PetscSolver11rhsFunctionEP5_p_TSdP6_p_VecS5 _+0xc4)[0x200005f710d4] [e28n07:864558] [25] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 00005f7130c] [e28n07:864558] [26] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] [e28n07:864557] [27] 
/gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] [e28n07:864558] [27] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] [e28n07:864557] [28] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] [e28n07:864558] [28] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] [e28n07:864557] [29] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] [e28n07:864558] [29] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] [e28n07:864557] *** End of error message *** /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] [e28n07:864558] *** End of error message *** It seems to be pointing to https://urldefense.us/v3/__https://petsc.org/release/manualpages/PetscSF/PetscSFBcastEnd/__;!!G_uCfscf7eWS!aLysH-zjWmDwsHlAFfiaeMvNJbnCcCztIFruGWStqtDV6RM6j9Xq3dxWMo1b-PVhyKP8XJ_ZhAD1ku70lCtt9Oz0$ so I wanted to check if you had seen this type of error before and if it could be related to how the code is compiled or run. Let me know if I can provide any additional information. Best, Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Mar 1 15:58:00 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 1 Mar 2024 15:58:00 -0600 Subject: [petsc-users] PAMI error on Summit In-Reply-To: References: Message-ID: It is weird, with jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj- shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi 1 petsc tried to test if the MPI is gpu aware (by doing an MPI_Allreduce on device buffers). It tried and found it was not, so it threw out the complaint in the error message. From https://urldefense.us/v3/__https://docs.olcf.ornl.gov/systems/summit_user_guide.html*cuda-aware-mpi__;Iw!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6XnfuQMkmJQ$ , I think your flags were right. I just got my Summit account reactivated today. I will give it a try. --Junchao Zhang On Fri, Mar 1, 2024 at 3:32?PM Blondel, Sophie wrote: > I have been using --smpiargs "-gpu". > > I tried the benchmark with "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve > -mat_type aijkokkos -use_gpu_aware_mpi 0" and it seems to work: > Fri Mar 1 16:27:14 EST 2024 > =========================================== > Test: KSP performance - Poisson > Input matrix: 27-pt finite difference stencil > -n 100 > DoFs = 1000000 > Number of nonzeros = 26463592 > > Step1 - creating Vecs and Mat... > Step2 - running KSPSolve()... > Step3 - calculating error norm... 
> > Error norm: 5.591e-02 > KSP iters: 63 > KSPSolve: 3.16646 seconds > FOM: 3.158e+05 DoFs/sec > =========================================== > > ------------------------------------------------------------ > Sender: LSF System > Subject: Job 3322694: in cluster Done > > Job was submitted from host by user in cluster > at Fri Mar 1 16:26:58 2024 > Job was executed on host(s) <1*batch3>, in queue , as user in > cluster at Fri Mar 1 16:27:00 2024 > <42*a35n05> > was used as the home directory. > was used as the working directory. > Started at Fri Mar 1 16:27:00 2024 > Terminated at Fri Mar 1 16:27:26 2024 > Results reported at Fri Mar 1 16:27:26 2024 > > The output (if any) is above this job summary. > > > If I switch to "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve > -mat_type aijkokkos -use_gpu_aware_mpi 1" it complains: > Fri Mar 1 16:25:02 EST 2024 > =========================================== > Test: KSP performance - Poisson > Input matrix: 27-pt finite difference stencil > -n 100 > DoFs = 1000000 > Number of nonzeros = 26463592 > > Step1 - creating Vecs and Mat... > [5]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not > GPU-aware. For better performance, please use a GPU-aware MPI. > [5]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To > not see the message again, add the option to your .petscrc, OR add it to > the env var PETSC_OPTIONS. > [5]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you > may need jsrun --smpiargs=-gpu. > [5]PETSC ERROR: For Open MPI, you need to configure it --with-cuda ( > https://urldefense.us/v3/__https://www.open-mpi.org/faq/?category=buildcuda__;!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6XnfqFNiwyb$ ) > [5]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 ( > https://urldefense.us/v3/__http://mvapich.cse.ohio-state.edu/userguide/gdr/__;!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6XnflzOHPZF$ ) > [5]PETSC ERROR: For Cray-MPICH, you need to set > MPICH_GPU_SUPPORT_ENABLED=1 (man mpi to see manual of cray-mpich) > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF > with errorcode 76. > > Best, > > Sophie > ------------------------------ > *From:* Junchao Zhang > *Sent:* Thursday, February 29, 2024 17:09 > *To:* Blondel, Sophie > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] PAMI error on Summit > > You don't often get email from junchao.zhang at gmail.com. Learn why this is > important > Could you try a petsc example to see if the environment is good? > For example, > > cd src/ksp/ksp/tutorials > make bench_kspsolve > mpirun -n 6 ./bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi {0 or > 1} > > BTW, I remember to use gpu-aware mpi on Summit, one needs to pass > --smpiargs "-gpu" to jsrun > > --Junchao Zhang > > > On Thu, Feb 29, 2024 at 3:22?PM Blondel, Sophie via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > I still get the same error when deactivating GPU-aware MPI. 
I also tried > unloading spectrum MPI and using openMPI instead (recompiling everything) > and I get a segfault in PETSc in that case (still using GPU-aware MPI I > think, at least not explicitly > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > I still get the same error when deactivating GPU-aware MPI. > > I also tried unloading spectrum MPI and using openMPI instead (recompiling > everything) and I get a segfault in PETSc in that case (still using > GPU-aware MPI I think, at least not explicitly turning it off): > > 0 TS dt 1e-12 time 0. > > [ERROR] [0]PETSC ERROR: > > [ERROR] > ------------------------------------------------------------------------ > > [ERROR] [0]PETSC ERROR: > > [ERROR] Caught signal number 11 SEGV: Segmentation Violation, probably > memory access out of range > > [ERROR] [0]PETSC ERROR: > > [ERROR] Try option -start_in_debugger or -on_error_attach_debugger > > [ERROR] [0]PETSC ERROR: > > [ERROR] or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6XnfiPz0QSo$ > and > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6Xnfp3itLqI$ > > > [ERROR] [0]PETSC ERROR: > > [ERROR] or try https://urldefense.us/v3/__https://docs.nvidia.com/cuda/cuda-memcheck/index.html__;!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6XnftilmehD$ > on > NVIDIA CUDA systems to find memory corruption errors > > [ERROR] [0]PETSC ERROR: > > [ERROR] configure using --with-debugging=yes, recompile, link, and run > > [ERROR] [0]PETSC ERROR: > > [ERROR] to get more information on the crash. > > [ERROR] [0]PETSC ERROR: > > [ERROR] Run with -malloc_debug to check if memory corruption is causing > the crash. > > -------------------------------------------------------------------------- > > Best, > > Sophie > ------------------------------ > *From:* Blondel, Sophie via Xolotl-psi-development < > xolotl-psi-development at lists.sourceforge.net> > *Sent:* Thursday, February 29, 2024 10:17 > *To:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* [Xolotl-psi-development] PAMI error on Summit > > Hi, > > I am using PETSc build with the Kokkos CUDA backend on Summit but when I > run my code with multiple MPI tasks I get the following error: > 0 TS dt 1e-12 time 0. > errno 14 pid 864558 > xolotl: > /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: > size_t PAMI::Dev > ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, > PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. > errno 14 pid 864557 > xolotl: > /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: > size_t PAMI::Dev > ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, > PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. 
> [... quoted stack trace trimmed: it repeats, frame for frame, the PAMI/Spectrum MPI trace already shown in Sophie's original message above ...]
> /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 > 00005f7130c] > [e28n07:864558] [26] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] > [e28n07:864557] [27] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] > [e28n07:864558] [27] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] > [e28n07:864557] [28] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] > [e28n07:864558] [28] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] > [e28n07:864557] [29] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] > [e28n07:864558] [29] > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] > [e28n07:864557] *** End of error message *** > > /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] > [e28n07:864558] *** End of error message *** > > It seems to be pointing to > https://urldefense.us/v3/__https://petsc.org/release/manualpages/PetscSF/PetscSFBcastEnd/__;!!G_uCfscf7eWS!dWQ1cCpmozMz4HPnFYCP7THRdg2r3s_6eD0IHbiJcn-3jWT-gNsmtjpP6h0x9jLoOdiQMrZ1wRI-83YJw6XnfuTiguK8$ > > so I wanted to check if you had seen this type of error before and if it > could be related to how the code is compiled or run. Let me know if I can > provide any additional information. > > Best, > > Sophie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fab4100 at posteo.ch Sat Mar 2 05:40:17 2024 From: fab4100 at posteo.ch (Fabian Wermelinger) Date: Sat, 2 Mar 2024 11:40:17 +0000 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From lzou at anl.gov Sun Mar 3 10:42:19 2024 From: lzou at anl.gov (Zou, Ling) Date: Sun, 3 Mar 2024 16:42:19 +0000 Subject: [petsc-users] FW: 'Preconditioning' with lower-order method In-Reply-To: References: Message-ID: Original email may have been sent to the incorrect place. See below. -Ling From: Zou, Ling Date: Sunday, March 3, 2024 at 10:34 AM To: petsc-users Subject: 'Preconditioning' with lower-order method Hi all, I am solving a PDE system over a spatial domain. Numerical methods are: * Finite volume method (both 1st and 2nd order implemented) * BDF1 and BDF2 for time integration. What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity. I?m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation. Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0. 1. Option ? 1, multi-step for each time step Step 1: solving F1(x) = 0 to obtain a temporary solution x1 Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution. [Not sure if gain any saving at all] 1. 
Option -2, dynamically changing residual function F(x) In pseudo code, would be something like. snesFormFunction(SNES snes, Vec u, Vec f, void *) { if (snes.nl_it_no < 4) // 4 being arbitrary here f = F1(u); else f = F2(u); } I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts? Best, -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Mar 3 11:09:44 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 3 Mar 2024 12:09:44 -0500 Subject: [petsc-users] FW: 'Preconditioning' with lower-order method In-Reply-To: References: Message-ID: On Sun, Mar 3, 2024 at 11:42?AM Zou, Ling via petsc-users < petsc-users at mcs.anl.gov> wrote: > Original email may have been sent to the incorrect place. > > See below. > > > > -Ling > > > > *From: *Zou, Ling > *Date: *Sunday, March 3, 2024 at 10:34 AM > *To: *petsc-users > *Subject: *'Preconditioning' with lower-order method > > Hi all, > > > > I am solving a PDE system over a spatial domain. Numerical methods are: > > - Finite volume method (both 1st and 2nd order implemented) > - BDF1 and BDF2 for time integration. > > What I have noticed is that 1st order FVM converges much faster than 2nd > order FVM, regardless the time integration scheme. Well, not surprising > since 2nd order FVM introduces additional non-linearity. > > > > I?m thinking about two possible ways to speed up 2nd order FVM, and would > like to get some thoughts or community knowledge before jumping into code > implementation. > > > > Say, let the 2nd order FVM residual function be *F*2(*x*) = 0; and the 1st > order FVM residual function be *F*1(*x*) = 0. > > 1. Option ? 1, multi-step for each time step > > Step 1: solving *F*1(*x*) = 0 to obtain a temporary solution * x*1 > > Step 2: feed *x*1 as an initial guess to solve *F*2(*x*) = 0 to obtain > the final solution. > > [Not sure if gain any saving at all] > > > > 1. Option -2, dynamically changing residual function F(x) > > In pseudo code, would be something like. > > > You can try it. I would doubt the linear version (1) would help. This is similar to "defect correction" but not the same. The nonlinear version (2) is something you could try. I've seen people switch nonlinear solvers like this but not operators. You could try it. Mark > snesFormFunction(SNES snes, Vec u, Vec f, void *) > > { > > if (snes.nl_it_no < 4) // 4 being arbitrary here > > f = F1(u); > > else > > f = F2(u); > > } > > > > I know this might be a bit crazy since it may crash after switching > residual function, still, any thoughts? > > > > Best, > > > > -Ling > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Sun Mar 3 11:15:25 2024 From: lzou at anl.gov (Zou, Ling) Date: Sun, 3 Mar 2024 17:15:25 +0000 Subject: [petsc-users] FW: 'Preconditioning' with lower-order method In-Reply-To: References: Message-ID: Thank you, Mark. This is encouraging! I will give it a try and report back. -Ling From: Mark Adams Date: Sunday, March 3, 2024 at 11:10 AM To: Zou, Ling Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] FW: 'Preconditioning' with lower-order method On Sun, Mar 3, 2024 at 11:?42 AM Zou, Ling via petsc-users wrote: Original email may have been sent to the incorrect place. See below. -Ling From: Zou, Ling Date: Sunday, March 3, 2024 at ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. 
ZjQcmQRYFpfptBannerEnd On Sun, Mar 3, 2024 at 11:42?AM Zou, Ling via petsc-users > wrote: Original email may have been sent to the incorrect place. See below. -Ling From: Zou, Ling > Date: Sunday, March 3, 2024 at 10:34 AM To: petsc-users > Subject: 'Preconditioning' with lower-order method Hi all, I am solving a PDE system over a spatial domain. Numerical methods are: * Finite volume method (both 1st and 2nd order implemented) * BDF1 and BDF2 for time integration. What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity. I?m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation. Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0. 1. Option ? 1, multi-step for each time step Step 1: solving F1(x) = 0 to obtain a temporary solution x1 Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution. [Not sure if gain any saving at all] 1. Option -2, dynamically changing residual function F(x) In pseudo code, would be something like. You can try it. I would doubt the linear version (1) would help. This is similar to "defect correction" but not the same. The nonlinear version (2) is something you could try. I've seen people switch nonlinear solvers like this but not operators. You could try it. Mark snesFormFunction(SNES snes, Vec u, Vec f, void *) { if (snes.nl_it_no < 4) // 4 being arbitrary here f = F1(u); else f = F2(u); } I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts? Best, -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Mar 3 12:06:21 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 3 Mar 2024 13:06:21 -0500 Subject: [petsc-users] 'Preconditioning' with lower-order method In-Reply-To: References: Message-ID: Are you forming the Jacobian for the first and second order cases inside of Newton? You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ...) for the two cases and compare them. > On Mar 3, 2024, at 11:42?AM, Zou, Ling via petsc-users wrote: > > Original email may have been sent to the incorrect place. > See below. > > -Ling > > From: Zou, Ling > > Date: Sunday, March 3, 2024 at 10:34 AM > To: petsc-users > > Subject: 'Preconditioning' with lower-order method > > Hi all, > > I am solving a PDE system over a spatial domain. Numerical methods are: > Finite volume method (both 1st and 2nd order implemented) > BDF1 and BDF2 for time integration. > What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity. > > I?m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation. > > Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0. > Option ? 1, multi-step for each time step > Step 1: solving F1(x) = 0 to obtain a temporary solution x1 > Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution. 
> [Not sure if gain any saving at all] > > Option -2, dynamically changing residual function F(x) > In pseudo code, would be something like. > > snesFormFunction(SNES snes, Vec u, Vec f, void *) > { > if (snes.nl_it_no < 4) // 4 being arbitrary here > f = F1(u); > else > f = F2(u); > } > > I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts? > > Best, > > -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Mar 3 16:07:40 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 3 Mar 2024 17:07:40 -0500 Subject: [petsc-users] Clarification for use of MatMPIBAIJSetPreallocationCSR In-Reply-To: References: Message-ID: <859A5696-CFA6-4D25-A06D-37C1C07D989E@petsc.dev> Clarify in documentation which routines copy the provided values https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7336__;!!G_uCfscf7eWS!Z2j2ynvPrOvIKRuhQdogedQsz8yANmf4jbsUEXvXz2afujWJLfsAdG45b-wsInVpXM3tB3H6h1vTdYbzamBV5hs$ > On Mar 2, 2024, at 6:40?AM, Fabian Wermelinger wrote: > > This Message Is From an External Sender > This message came from outside your organization. > On Fri, 01 Mar 2024 12:51:45 -0600, Junchao Zhang wrote: > >> The preferred method then is to use MatSetValues() for the copies (or > >> possibly MatSetValuesRow() or MatSetValuesBlocked())? > > MatSetValuesRow() > > Thanks > > >> The term "Preallocation" is confusing to me. For example, > >> MatCreateMPIBAIJWithArrays clearly states in the doc that arrays are copied > >> (https://urldefense.us/v3/__https://petsc.org/release/manualpages/Mat/MatCreateMPIBAIJWithArrays/__;!!G_uCfscf7eWS!cls7psuvaRuDYuXpUpmU7lcEhX9AO0bb3qTpszwuTNP8LPPrJkCzoaHIdJCzxPR36D7SLYs9MKCxqMBKRnSQc8s$), I > >> would then assume PETSc maintains internal storage for it. If something is > >> preallocated, I would not make that assumption. > >"Preallocation" in petsc means "tell petsc sizes of rows in a matrix", so that > >petsc can preallocate the memory before you do MatSetValues(). This is clearer > >in https://urldefense.us/v3/__https://petsc.org/release/manualpages/Mat/MatMPIAIJSetPreallocation/__;!!G_uCfscf7eWS!cls7psuvaRuDYuXpUpmU7lcEhX9AO0bb3qTpszwuTNP8LPPrJkCzoaHIdJCzxPR36D7SLYs9MKCxqMBKrYGemdo$ > > Thanks for the clarification! > > All best, > > -- > fabs -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Mar 3 21:24:14 2024 From: jed at jedbrown.org (Jed Brown) Date: Sun, 03 Mar 2024 20:24:14 -0700 Subject: [petsc-users] FW: 'Preconditioning' with lower-order method In-Reply-To: References: Message-ID: <87il22wug1.fsf@jedbrown.org> An HTML attachment was scrubbed... URL: From lzou at anl.gov Sun Mar 3 23:22:29 2024 From: lzou at anl.gov (Zou, Ling) Date: Mon, 4 Mar 2024 05:22:29 +0000 Subject: [petsc-users] 'Preconditioning' with lower-order method In-Reply-To: References: Message-ID: Barry, thank you. I am not sure if I exactly follow you on this: ?Are you forming the Jacobian for the first and second order cases inside of Newton?? The problem that we deal with, heat/mass transfer in heterogeneous systems (reactor system), is generally small in terms of size, i.e., # of DOFs (several k to maybe 100k level), so for now, I completely rely on PETSc to compute Jacobian, i.e., finite-differencing. That?s a good suggestion to see the time spent during various events. What motivated me to try the options are the following observations. 
2nd order FVM:

Time Step 149, time = 13229.7, dt = 100
NL Step = 0, fnorm = 7.80968E-03
NL Step = 1, fnorm = 7.65731E-03
NL Step = 2, fnorm = 6.85034E-03
NL Step = 3, fnorm = 6.11873E-03
NL Step = 4, fnorm = 1.57347E-03
NL Step = 5, fnorm = 9.03536E-04
Solve Converged!

1st order FVM:

Time Step 149, time = 13229.7, dt = 100
NL Step = 0, fnorm = 7.90072E-03
NL Step = 1, fnorm = 2.01919E-04
NL Step = 2, fnorm = 1.06960E-05
NL Step = 3, fnorm = 2.41683E-09
Solve Converged!

Notice the obvious stagnation of the residual for the 2nd order method, which is absent in the 1st order. For the same problem, the wall time is 10 sec vs 6 sec. I would be happy if I can reduce 2 sec for the 2nd order method.

-Ling

From: Barry Smith
Date: Sunday, March 3, 2024 at 12:06 PM
To: Zou, Ling
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] 'Preconditioning' with lower-order method

Are you forming the Jacobian for the first and second order cases inside of Newton?

You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ...) for the two cases and compare them.

On Mar 3, 2024, at 11:42 AM, Zou, Ling via petsc-users wrote:

Original email may have been sent to the incorrect place. See below.

-Ling

From: Zou, Ling >
Date: Sunday, March 3, 2024 at 10:34 AM
To: petsc-users >
Subject: 'Preconditioning' with lower-order method

Hi all,

I am solving a PDE system over a spatial domain. Numerical methods are:

* Finite volume method (both 1st and 2nd order implemented)
* BDF1 and BDF2 for time integration.

What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity.

I'm thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation.

Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0.

1. Option - 1, multi-step for each time step
Step 1: solving F1(x) = 0 to obtain a temporary solution x1
Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution.
[Not sure if gain any saving at all]

2. Option - 2, dynamically changing residual function F(x)
In pseudo code, would be something like.

snesFormFunction(SNES snes, Vec u, Vec f, void *)
{
  if (snes.nl_it_no < 4) // 4 being arbitrary here
    f = F1(u);
  else
    f = F2(u);
}

I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts?

Best,

-Ling
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lzou at anl.gov Sun Mar 3 23:32:13 2024
From: lzou at anl.gov (Zou, Ling)
Date: Mon, 4 Mar 2024 05:32:13 +0000
Subject: [petsc-users] FW: 'Preconditioning' with lower-order method
In-Reply-To: <87il22wug1.fsf@jedbrown.org>
References: <87il22wug1.fsf@jedbrown.org>
Message-ID:

From: Jed Brown
Date: Sunday, March 3, 2024 at 9:24 PM
To: Zou, Ling , petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] FW: 'Preconditioning' with lower-order method

One option is to form the preconditioner using the FV1 method, which is sparser and satisfies h-ellipticity, while using FV2 for the residual and (optionally) for matrix-free operator application.

<<< In terms of code implementation, this seems a bit tricky to me. It looks to me like I have to know exactly who is calling the residual function: if it is the MF operation, use FV2, while if it is finite-differencing for the Jacobian, use FV1. Currently, I don't know how to do it. Another thing I'd like to mention is that the linear solver has never really been an issue, while the non-linear solver for the FV2 scheme often stagnates during the first couple of non-linear iterations [see the other email reply to Barry]. It seems to me the additional nonlinearity from the TVD limiter is causing difficulty for PETSc in finding the attraction zone.

FV1 is a highly diffusive method so in a sense, it's much less faithful to the physics and (say, in the case of fluids) similar to a much lower-Reynolds number (if you use a modified equation analysis to work out the effective Reynolds number in the presence of the numerical diffusion). It's good to put some thought into your choice of limiter. Note that intersection of second order and TVD methods leads to mandatory nonsmoothness (discontinuous derivatives).

<<< Yeah, I am afraid that the TVD limiter is the issue, so that's the reason I'd try to use FV1 to bring the solution (hopefully) closer to the real solution so the non-linear solver has an easy job to do.

"Zou, Ling via petsc-users" writes:
> Original email may have been sent to the incorrect place.
> See below.
>
> -Ling
>
> From: Zou, Ling
> Date: Sunday, March 3, 2024 at 10:34 AM
> To: petsc-users
> Subject: 'Preconditioning' with lower-order method
> Hi all,
>
> I am solving a PDE system over a spatial domain. Numerical methods are:
>
> * Finite volume method (both 1st and 2nd order implemented)
> * BDF1 and BDF2 for time integration.
> What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity.
>
> I'm thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation.
>
> Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0.
>
> 1. Option - 1, multi-step for each time step
> Step 1: solving F1(x) = 0 to obtain a temporary solution x1
> Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution.
> [Not sure if gain any saving at all]
>
> 2. Option - 2, dynamically changing residual function F(x)
>
> In pseudo code, would be something like.
>
> snesFormFunction(SNES snes, Vec u, Vec f, void *)
> {
>   if (snes.nl_it_no < 4) // 4 being arbitrary here
>     f = F1(u);
>   else
>     f = F2(u);
> }
>
> I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts?
>
> Best,
>
> -Ling
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jed at jedbrown.org Sun Mar 3 23:34:51 2024
From: jed at jedbrown.org (Jed Brown)
Date: Sun, 03 Mar 2024 22:34:51 -0700
Subject: [petsc-users] 'Preconditioning' with lower-order method
In-Reply-To:
References:
Message-ID: <8734t6woec.fsf@jedbrown.org>

An HTML attachment was scrubbed...
URL:

From lzou at anl.gov Sun Mar 3 23:48:12 2024
From: lzou at anl.gov (Zou, Ling)
Date: Mon, 4 Mar 2024 05:48:12 +0000
Subject: [petsc-users] 'Preconditioning' with lower-order method
In-Reply-To: <8734t6woec.fsf@jedbrown.org>
References: <8734t6woec.fsf@jedbrown.org>
Message-ID:

From: Jed Brown
Date: Sunday, March 3, 2024 at 11:35 PM
To: Zou, Ling , Barry Smith
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] 'Preconditioning' with lower-order method

If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually).

<<< Yes, I do use coloring, and I do provide a sufficient stencil, i.e., neighbor's neighbor. The sufficiency is confirmed by PETSc's -snes_test_jacobian and -snes_test_jacobian_view options.

Note that assembling a Jacobian of a second order TVD operator requires at least second neighbors while the first order needs only first neighbors, thus is much sparser and needs fewer colors to compute.

<<< In my code implementation, when marking the Jacobian nonzero pattern, I don't differentiate FV1 or FV2; I always use the FV2 stencil, so it's a bit 'fat' for the FV1 method, but it worked just fine.

I expect you're either not exploiting that in the timings or something else is amiss. You can run with `-log_view -snes_view -ksp_converged_reason` to get a bit more information about what's happening.

<<< The attached is the screen output as you suggested. Both the linear and nonlinear performance of FV2 are worse in the output.
FV2: Time Step 149, time = 13229.7, dt = 100 NL Step = 0, fnorm = 7.80968E-03 Linear solve converged due to CONVERGED_RTOL iterations 26 NL Step = 1, fnorm = 7.65731E-03 Linear solve converged due to CONVERGED_RTOL iterations 24 NL Step = 2, fnorm = 6.85034E-03 Linear solve converged due to CONVERGED_RTOL iterations 27 NL Step = 3, fnorm = 6.11873E-03 Linear solve converged due to CONVERGED_RTOL iterations 25 NL Step = 4, fnorm = 1.57347E-03 Linear solve converged due to CONVERGED_RTOL iterations 27 NL Step = 5, fnorm = 9.03536E-04 SNES Object: 1 MPI process type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-06, solution=1e-08 total number of linear solver iterations=129 total number of function evaluations=144 norm schedule ALWAYS Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift to prevent zero pivot [NONZERO] matrix ordering: rcm factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 package used to perform factorization: petsc total: nonzeros=38485, allocated nonzeros=38485 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=8715, cols=8715 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 total: nonzeros=38485, allocated nonzeros=38485 total number of mallocs used during MatSetValues calls=0 not using I-node routines Solve Converged! 
FV1: Time Step 149, time = 13229.7, dt = 100 NL Step = 0, fnorm = 7.90072E-03 Linear solve converged due to CONVERGED_RTOL iterations 12 NL Step = 1, fnorm = 2.01919E-04 Linear solve converged due to CONVERGED_RTOL iterations 17 NL Step = 2, fnorm = 1.06960E-05 Linear solve converged due to CONVERGED_RTOL iterations 15 NL Step = 3, fnorm = 2.41683E-09 SNES Object: 1 MPI process type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-06, solution=1e-08 total number of linear solver iterations=44 total number of function evaluations=51 norm schedule ALWAYS Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift to prevent zero pivot [NONZERO] matrix ordering: rcm factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 package used to perform factorization: petsc total: nonzeros=38485, allocated nonzeros=38485 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=8715, cols=8715 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 total: nonzeros=38485, allocated nonzeros=38485 total number of mallocs used during MatSetValues calls=0 not using I-node routines Solve Converged! "Zou, Ling via petsc-users" writes: > Barry, thank you. > I am not sure if I exactly follow you on this: > ?Are you forming the Jacobian for the first and second order cases inside of Newton?? > > The problem that we deal with, heat/mass transfer in heterogeneous systems (reactor system), is generally small in terms of size, i.e., # of DOFs (several k to maybe 100k level), so for now, I completely rely on PETSc to compute Jacobian, i.e., finite-differencing. > > That?s a good suggestion to see the time spent during various events. > What motivated me to try the options are the following observations. > > 2nd order FVM: > > Time Step 149, time = 13229.7, dt = 100 > > NL Step = 0, fnorm = 7.80968E-03 > > NL Step = 1, fnorm = 7.65731E-03 > > NL Step = 2, fnorm = 6.85034E-03 > > NL Step = 3, fnorm = 6.11873E-03 > > NL Step = 4, fnorm = 1.57347E-03 > > NL Step = 5, fnorm = 9.03536E-04 > > Solve Converged! > > 1st order FVM: > > Time Step 149, time = 13229.7, dt = 100 > > NL Step = 0, fnorm = 7.90072E-03 > > NL Step = 1, fnorm = 2.01919E-04 > > NL Step = 2, fnorm = 1.06960E-05 > > NL Step = 3, fnorm = 2.41683E-09 > > Solve Converged! > > Notice the obvious ?stagnant? in residual for the 2nd order method while not in the 1st order. > For the same problem, the wall time is 10 sec vs 6 sec. 
I would be happy if I can reduce 2 sec for the 2nd order method. > > -Ling > > From: Barry Smith > Date: Sunday, March 3, 2024 at 12:06 PM > To: Zou, Ling > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] 'Preconditioning' with lower-order method > Are you forming the Jacobian for the first and second order cases inside of Newton? You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ..?.?) for the two cases > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Are you forming the Jacobian for the first and second order cases inside of Newton? > > You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ...) for the two cases and compare them. > > > > > On Mar 3, 2024, at 11:42?AM, Zou, Ling via petsc-users wrote: > > Original email may have been sent to the incorrect place. > See below. > > -Ling > > From: Zou, Ling > > Date: Sunday, March 3, 2024 at 10:34 AM > To: petsc-users > > Subject: 'Preconditioning' with lower-order method > Hi all, > > I am solving a PDE system over a spatial domain. Numerical methods are: > > * Finite volume method (both 1st and 2nd order implemented) > * BDF1 and BDF2 for time integration. > What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity. > > I?m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation. > > Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0. > > 1. Option ? 1, multi-step for each time step > Step 1: solving F1(x) = 0 to obtain a temporary solution x1 > Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution. > [Not sure if gain any saving at all] > > > 1. Option -2, dynamically changing residual function F(x) > In pseudo code, would be something like. > > snesFormFunction(SNES snes, Vec u, Vec f, void *) > { > if (snes.nl_it_no < 4) // 4 being arbitrary here > f = F1(u); > else > f = F2(u); > } > > I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts? > > Best, > > -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Mon Mar 4 11:10:45 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 4 Mar 2024 11:10:45 -0600 Subject: [petsc-users] PAMI error on Summit In-Reply-To: References: Message-ID: Hi, Sophie, I tried various modules and compilers on Summit and failed to find one that works with gpu aware mpi. The one that could build petsc and kokkos was "module load cuda/11.7.1 gcc/9.3.0-compiler_only spectrum-mpi essl netlib-lapack". But it only worked with "-use_gpu_aware_mpi 0". Without it, I saw code crashes. From what I can see, the gpu-aware mpi on Summit is an unusable and unmaintained state. 
--Junchao Zhang On Fri, Mar 1, 2024 at 3:58?PM Junchao Zhang wrote: > It is weird, with > jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj- > shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve > -mat_type aijkokkos -use_gpu_aware_mpi 1 > > petsc tried to test if the MPI is gpu aware (by doing an MPI_Allreduce on > device buffers). It tried and found it was not, so it threw out the > complaint in the error message. > > From > https://urldefense.us/v3/__https://docs.olcf.ornl.gov/systems/summit_user_guide.html*cuda-aware-mpi__;Iw!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMstbOGRUZc$ , > I think your flags were right. > > I just got my Summit account reactivated today. I will give it a try. > > --Junchao Zhang > > > On Fri, Mar 1, 2024 at 3:32?PM Blondel, Sophie wrote: > >> I have been using --smpiargs "-gpu". >> >> I tried the benchmark with "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve >> -mat_type aijkokkos -use_gpu_aware_mpi 0" and it seems to work: >> Fri Mar 1 16:27:14 EST 2024 >> =========================================== >> Test: KSP performance - Poisson >> Input matrix: 27-pt finite difference stencil >> -n 100 >> DoFs = 1000000 >> Number of nonzeros = 26463592 >> >> Step1 - creating Vecs and Mat... >> Step2 - running KSPSolve()... >> Step3 - calculating error norm... >> >> Error norm: 5.591e-02 >> KSP iters: 63 >> KSPSolve: 3.16646 seconds >> FOM: 3.158e+05 DoFs/sec >> =========================================== >> >> ------------------------------------------------------------ >> Sender: LSF System >> Subject: Job 3322694: in cluster Done >> >> Job was submitted from host by user in >> cluster at Fri Mar 1 16:26:58 2024 >> Job was executed on host(s) <1*batch3>, in queue , as user >> in cluster at Fri Mar 1 16:27:00 2024 >> <42*a35n05> >> was used as the home directory. >> was used as the working directory. >> Started at Fri Mar 1 16:27:00 2024 >> Terminated at Fri Mar 1 16:27:26 2024 >> Results reported at Fri Mar 1 16:27:26 2024 >> >> The output (if any) is above this job summary. >> >> >> If I switch to "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve >> -mat_type aijkokkos -use_gpu_aware_mpi 1" it complains: >> Fri Mar 1 16:25:02 EST 2024 >> =========================================== >> Test: KSP performance - Poisson >> Input matrix: 27-pt finite difference stencil >> -n 100 >> DoFs = 1000000 >> Number of nonzeros = 26463592 >> >> Step1 - creating Vecs and Mat... >> [5]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not >> GPU-aware. For better performance, please use a GPU-aware MPI. >> [5]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To >> not see the message again, add the option to your .petscrc, OR add it to >> the env var PETSC_OPTIONS. >> [5]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you >> may need jsrun --smpiargs=-gpu. 
>> [5]PETSC ERROR: For Open MPI, you need to configure it --with-cuda ( >> https://urldefense.us/v3/__https://www.open-mpi.org/faq/?category=buildcuda__;!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMstegRM8hj$ ) >> [5]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 ( >> https://urldefense.us/v3/__http://mvapich.cse.ohio-state.edu/userguide/gdr/__;!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMstf6tc3Lv$ ) >> [5]PETSC ERROR: For Cray-MPICH, you need to set >> MPICH_GPU_SUPPORT_ENABLED=1 (man mpi to see manual of cray-mpich) >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF >> with errorcode 76. >> >> Best, >> >> Sophie >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Thursday, February 29, 2024 17:09 >> *To:* Blondel, Sophie >> *Cc:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov> >> *Subject:* Re: [petsc-users] PAMI error on Summit >> >> You don't often get email from junchao.zhang at gmail.com. Learn why this >> is important >> Could you try a petsc example to see if the environment is good? >> For example, >> >> cd src/ksp/ksp/tutorials >> make bench_kspsolve >> mpirun -n 6 ./bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi {0 >> or 1} >> >> BTW, I remember to use gpu-aware mpi on Summit, one needs to pass >> --smpiargs "-gpu" to jsrun >> >> --Junchao Zhang >> >> >> On Thu, Feb 29, 2024 at 3:22?PM Blondel, Sophie via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> I still get the same error when deactivating GPU-aware MPI. I also tried >> unloading spectrum MPI and using openMPI instead (recompiling everything) >> and I get a segfault in PETSc in that case (still using GPU-aware MPI I >> think, at least not explicitly >> ZjQcmQRYFpfptBannerStart >> This Message Is From an External Sender >> This message came from outside your organization. >> >> ZjQcmQRYFpfptBannerEnd >> I still get the same error when deactivating GPU-aware MPI. >> >> I also tried unloading spectrum MPI and using openMPI instead >> (recompiling everything) and I get a segfault in PETSc in that case (still >> using GPU-aware MPI I think, at least not explicitly turning it off): >> >> 0 TS dt 1e-12 time 0. 
>> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] >> ------------------------------------------------------------------------ >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] Caught signal number 11 SEGV: Segmentation Violation, probably >> memory access out of range >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] Try option -start_in_debugger or -on_error_attach_debugger >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMstTXwhErY$ >> and >> https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMstcIOR87Y$ >> >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] or try https://urldefense.us/v3/__https://docs.nvidia.com/cuda/cuda-memcheck/index.html__;!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMstUmD8idW$ >> on >> NVIDIA CUDA systems to find memory corruption errors >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] configure using --with-debugging=yes, recompile, link, and run >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] to get more information on the crash. >> >> [ERROR] [0]PETSC ERROR: >> >> [ERROR] Run with -malloc_debug to check if memory corruption is causing >> the crash. >> >> -------------------------------------------------------------------------- >> >> Best, >> >> Sophie >> ------------------------------ >> *From:* Blondel, Sophie via Xolotl-psi-development < >> xolotl-psi-development at lists.sourceforge.net> >> *Sent:* Thursday, February 29, 2024 10:17 >> *To:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov> >> *Subject:* [Xolotl-psi-development] PAMI error on Summit >> >> Hi, >> >> I am using PETSc build with the Kokkos CUDA backend on Summit but when I >> run my code with multiple MPI tasks I get the following error: >> 0 TS dt 1e-12 time 0. >> errno 14 pid 864558 >> xolotl: >> /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: >> size_t PAMI::Dev >> ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, >> PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. >> errno 14 pid 864557 >> xolotl: >> /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: >> size_t PAMI::Dev >> ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, >> PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. 
>> [e28n07:864557] *** Process received signal *** >> [e28n07:864557] Signal: Aborted (6) >> [e28n07:864557] Signal code: (-6) >> [e28n07:864557] [ 0] >> linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x2000000604d8] >> [e28n07:864557] [ 1] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (gsignal+0xd8)[0x200005d796f8] >> [e28n07:864557] [ 2] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (abort+0x164)[0x200005d53ff4] >> [e28n07:864557] [ 3] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (+0x3d280)[0x200005d6d280] >> [e28n07:864557] [ 4] [e28n07:864558] *** Process received signal *** >> [e28n07:864558] Signal: Aborted (6) >> [e28n07:864558] Signal code: (-6) >> [e28n07:864558] [ 0] >> linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x2000000604d8] >> [e28n07:864558] [ 1] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (gsignal+0xd8)[0x200005d796f8] >> [e28n07:864558] [ 2] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (abort+0x164)[0x200005d53ff4] >> [e28n07:864558] [ 3] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (+0x3d280)[0x200005d6d280] >> [e28n07:864558] [ 4] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (__assert_fail+0x64)[0x200005d6d324] >> [e28n07:864557] [ 5] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> >> (_ZN4PAMI8Protocol3Get7GetRdmaINS_6Device5Shmem8DmaModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic12NativeAt >> >> omicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEELb0EEESL_E6simpleEP18pami_rget_simple_t+0x1d8)[0x20007f3971d8] >> [e28n07:864557] [ 6] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> >> (_ZN4PAMI8Protocol3Get13CompositeRGetINS1_4RGetES3_E6simpleEP18pami_rget_simple_t+0x40)[0x20007f2ecc10] >> [e28n07:864557] [ 7] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> (_ZN4PAMI7Context9rget_implEP18pami_rget_simple_t+0x28c)[0x20007f31a78c] >> [e28n07:864557] [ 8] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> (PAMI_Rget+0x18)[0x20007f2d94a8] >> [e28n07:864557] [ 9] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p >> ami.so(process_rndv_msg+0x46c)[0x2000a80159ac] >> [e28n07:864557] [10] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p >> ami.so(pml_pami_recv_rndv_cb+0x2bc)[0x2000a801670c] >> [e28n07:864557] [11] /lib64/glibc-hwcaps/power9/libc-2.28.so >> (__assert_fail+0x64)[0x200005d6d324] >> [e28n07:864558] [ 5] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> >> (_ZN4PAMI8Protocol3Get7GetRdmaINS_6Device5Shmem8DmaModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic12NativeAt >> >> omicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEELb0EEESL_E6simpleEP18pami_rget_simple_t+0x1d8)[0x20007f3971d8] >> [e28n07:864558] [ 6] >> 
/sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> >> (_ZN4PAMI8Protocol3Get13CompositeRGetINS1_4RGetES3_E6simpleEP18pami_rget_simple_t+0x40)[0x20007f2ecc10] >> [e28n07:864558] [ 7] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> (_ZN4PAMI7Context9rget_implEP18pami_rget_simple_t+0x28c)[0x20007f31a78c] >> [e28n07:864558] [ 8] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> (PAMI_Rget+0x18)[0x20007f2d94a8] >> [e28n07:864558] [ 9] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p >> ami.so(process_rndv_msg+0x46c)[0x2000a80159ac] >> [e28n07:864558] [10] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p >> ami.so(pml_pami_recv_rndv_cb+0x2bc)[0x2000a801670c] >> [e28n07:864558] [11] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> >> (_ZN4PAMI8Protocol4Send11EagerSimpleINS_6Device5Shmem11PacketModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic >> >> 12NativeAtomicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEEEELNS1_15configuration_tE5EE15dispatch_packedEPvSP_mSP_SP_+0x4c)[0x20007f2e30ac] >> [e28n07:864557] [12] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> (PAMI_Context_advancev+0x6b0)[0x20007f2da540] >> [e28n07:864557] [13] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p >> ami.so(mca_pml_pami_progress+0x34)[0x2000a80073e4] >> [e28n07:864557] [14] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libopen-pal.so.3(opal_ >> progress+0x6c)[0x20003d60640c] >> [e28n07:864557] [15] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(ompi_r >> equest_default_wait_all+0x144)[0x2000034c4b04] >> [e28n07:864557] [16] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(PMPI_W >> aitall+0x10c)[0x20000352790c] >> [e28n07:864557] [17] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> >> (_ZN4PAMI8Protocol4Send11EagerSimpleINS_6Device5Shmem11PacketModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic >> >> 12NativeAtomicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEEEELNS1_15configuration_tE5EE15dispatch_packedEPvSP_mSP_SP_+0x4c)[0x20007f2e30ac] >> [e28n07:864558] [12] >> 
/sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 >> (PAMI_Context_advancev+0x6b0)[0x20007f2da540] >> [e28n07:864558] [13] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p >> ami.so(mca_pml_pami_progress+0x34)[0x2000a80073e4] >> [e28n07:864558] [14] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libopen-pal.so.3(opal_ >> progress+0x6c)[0x20003d60640c] >> [e28n07:864558] [15] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(ompi_r >> equest_default_wait_all+0x144)[0x2000034c4b04] >> [e28n07:864558] [16] >> /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(PMPI_W >> aitall+0x10c)[0x20000352790c] >> [e28n07:864558] [17] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3ca7b0)[0x2000004ea7b0] >> [e28n07:864557] [18] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3ca7b0)[0x2000004ea7b0] >> [e28n07:864558] [18] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3c5e68)[0x2000004e5e68] >> [e28n07:864557] [19] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3c5e68)[0x2000004e5e68] >> [e28n07:864558] [19] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(PetscSFBcastEnd+0x74)[0x2000004c9214] >> [e28n07:864557] [20] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(PetscSFBcastEnd+0x74)[0x2000004c9214] >> [e28n07:864558] [20] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3b4cb0)[0x2000004d4cb0] >> [e28n07:864557] [21] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3b4cb0)[0x2000004d4cb0] >> [e28n07:864558] [21] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(VecScatterEnd+0x178)[0x2000004dd038] >> [e28n07:864558] [22] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(VecScatterEnd+0x178)[0x2000004dd038] >> [e28n07:864557] [22] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x1112be0)[0x200001232be0] >> [e28n07:864558] [23] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x1112be0)[0x200001232be0] >> [e28n07:864557] [23] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(DMGlobalToLocalEnd+0x470)[0x200000e9b0f0] >> [e28n07:864557] [24] >> /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11PetscSolver11rhsFunctionEP5_p_TSdP6_p_VecS5 >> _+0xc4)[0x200005f710d4] >> [e28n07:864557] [25] >> /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 >> 00005f7130c] >> [e28n07:864557] [26] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(DMGlobalToLocalEnd+0x470)[0x200000e9b0f0] >> [e28n07:864558] [24] 
>> /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11PetscSolver11rhsFunctionEP5_p_TSdP6_p_VecS5 >> _+0xc4)[0x200005f710d4] >> [e28n07:864558] [25] >> /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 >> 00005f7130c] >> [e28n07:864558] [26] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] >> [e28n07:864557] [27] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] >> [e28n07:864558] [27] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] >> [e28n07:864557] [28] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] >> [e28n07:864558] [28] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] >> [e28n07:864557] [29] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] >> [e28n07:864558] [29] >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] >> [e28n07:864557] *** End of error message *** >> >> /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] >> [e28n07:864558] *** End of error message *** >> >> It seems to be pointing to >> https://urldefense.us/v3/__https://petsc.org/release/manualpages/PetscSF/PetscSFBcastEnd/__;!!G_uCfscf7eWS!cq445CXteimKBMZKF1HQqgEFTwREIrbMMm5Cn-sCV3wDm2A3tixBsge_FLfW-3YKRxtbYWK9D29cMq338kMste-N8hvu$ >> >> so I wanted to check if you had seen this type of error before and if it >> could be related to how the code is compiled or run. Let me know if I can >> provide any additional information. >> >> Best, >> >> Sophie >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sblondel at utk.edu Mon Mar 4 13:14:50 2024 From: sblondel at utk.edu (Blondel, Sophie) Date: Mon, 4 Mar 2024 19:14:50 +0000 Subject: [petsc-users] PAMI error on Summit In-Reply-To: References: Message-ID: Thank you Junchao for looking into it. I managed to build a previous version of Xolotl that uses Kokkos (but PETSc is default PETSc without Kokkos) so that we have at least partial GPU support. Best, Sophie ________________________________ From: Junchao Zhang Sent: Monday, March 4, 2024 12:10 To: Blondel, Sophie Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PAMI error on Summit You don't often get email from junchao.zhang at gmail.com. Learn why this is important Hi, Sophie, I tried various modules and compilers on Summit and failed to find one that works with gpu aware mpi. The one that could build petsc and kokkos was "module load cuda/11.7.1 gcc/9.3.0-compiler_only spectrum-mpi essl netlib-lapack". But it only worked with "-use_gpu_aware_mpi 0". Without it, I saw code crashes. From what I can see, the gpu-aware mpi on Summit is an unusable and unmaintained state. 
--Junchao Zhang On Fri, Mar 1, 2024 at 3:58?PM Junchao Zhang > wrote: It is weird, with jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi 1 petsc tried to test if the MPI is gpu aware (by doing an MPI_Allreduce on device buffers). It tried and found it was not, so it threw out the complaint in the error message. >From https://urldefense.us/v3/__https://docs.olcf.ornl.gov/systems/summit_user_guide.html*cuda-aware-mpi__;Iw!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrRwOcNePg$ , I think your flags were right. I just got my Summit account reactivated today. I will give it a try. --Junchao Zhang On Fri, Mar 1, 2024 at 3:32?PM Blondel, Sophie > wrote: I have been using --smpiargs "-gpu". I tried the benchmark with "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi 0" and it seems to work: Fri Mar 1 16:27:14 EST 2024 =========================================== Test: KSP performance - Poisson Input matrix: 27-pt finite difference stencil -n 100 DoFs = 1000000 Number of nonzeros = 26463592 Step1 - creating Vecs and Mat... Step2 - running KSPSolve()... Step3 - calculating error norm... Error norm: 5.591e-02 KSP iters: 63 KSPSolve: 3.16646 seconds FOM: 3.158e+05 DoFs/sec =========================================== ------------------------------------------------------------ Sender: LSF System Subject: Job 3322694: in cluster Done Job was submitted from host by user in cluster at Fri Mar 1 16:26:58 2024 Job was executed on host(s) <1*batch3>, in queue , as user in cluster at Fri Mar 1 16:27:00 2024 <42*a35n05> was used as the home directory. was used as the working directory. Started at Fri Mar 1 16:27:00 2024 Terminated at Fri Mar 1 16:27:26 2024 Results reported at Fri Mar 1 16:27:26 2024 The output (if any) is above this job summary. If I switch to "jsrun --smpiargs "-gpu" -n 6 -a 1 -c 1 -g 1 /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos/src/ksp/ksp/tutorials/bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi 1" it complains: Fri Mar 1 16:25:02 EST 2024 =========================================== Test: KSP performance - Poisson Input matrix: 27-pt finite difference stencil -n 100 DoFs = 1000000 Number of nonzeros = 26463592 Step1 - creating Vecs and Mat... [5]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI. [5]PETSC ERROR: If you do not care, add option -use_gpu_aware_mpi 0. To not see the message again, add the option to your .petscrc, OR add it to the env var PETSC_OPTIONS. [5]PETSC ERROR: If you do care, for IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu. 
[5]PETSC ERROR: For Open MPI, you need to configure it --with-cuda (https://urldefense.us/v3/__https://www.open-mpi.org/faq/?category=buildcuda__;!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrR2coCcH6$ ) [5]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (https://urldefense.us/v3/__http://mvapich.cse.ohio-state.edu/userguide/gdr/__;!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrRz7NgbVW$ ) [5]PETSC ERROR: For Cray-MPICH, you need to set MPICH_GPU_SUPPORT_ENABLED=1 (man mpi to see manual of cray-mpich) -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF with errorcode 76. Best, Sophie ________________________________ From: Junchao Zhang > Sent: Thursday, February 29, 2024 17:09 To: Blondel, Sophie > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] PAMI error on Summit You don't often get email from junchao.zhang at gmail.com. Learn why this is important Could you try a petsc example to see if the environment is good? For example, cd src/ksp/ksp/tutorials make bench_kspsolve mpirun -n 6 ./bench_kspsolve -mat_type aijkokkos -use_gpu_aware_mpi {0 or 1} BTW, I remember to use gpu-aware mpi on Summit, one needs to pass --smpiargs "-gpu" to jsrun --Junchao Zhang On Thu, Feb 29, 2024 at 3:22?PM Blondel, Sophie via petsc-users > wrote: I still get the same error when deactivating GPU-aware MPI. I also tried unloading spectrum MPI and using openMPI instead (recompiling everything) and I get a segfault in PETSc in that case (still using GPU-aware MPI I think, at least not explicitly ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd I still get the same error when deactivating GPU-aware MPI. I also tried unloading spectrum MPI and using openMPI instead (recompiling everything) and I get a segfault in PETSc in that case (still using GPU-aware MPI I think, at least not explicitly turning it off): 0 TS dt 1e-12 time 0. [ERROR] [0]PETSC ERROR: [ERROR] ------------------------------------------------------------------------ [ERROR] [0]PETSC ERROR: [ERROR] Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [ERROR] [0]PETSC ERROR: [ERROR] Try option -start_in_debugger or -on_error_attach_debugger [ERROR] [0]PETSC ERROR: [ERROR] or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrRz1PahPX$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrRxqFIxGa$ [ERROR] [0]PETSC ERROR: [ERROR] or try https://urldefense.us/v3/__https://docs.nvidia.com/cuda/cuda-memcheck/index.html__;!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrRyLPkW_4$ on NVIDIA CUDA systems to find memory corruption errors [ERROR] [0]PETSC ERROR: [ERROR] configure using --with-debugging=yes, recompile, link, and run [ERROR] [0]PETSC ERROR: [ERROR] to get more information on the crash. [ERROR] [0]PETSC ERROR: [ERROR] Run with -malloc_debug to check if memory corruption is causing the crash. 
-------------------------------------------------------------------------- Best, Sophie ________________________________ From: Blondel, Sophie via Xolotl-psi-development > Sent: Thursday, February 29, 2024 10:17 To: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov > Subject: [Xolotl-psi-development] PAMI error on Summit Hi, I am using PETSc build with the Kokkos CUDA backend on Summit but when I run my code with multiple MPI tasks I get the following error: 0 TS dt 1e-12 time 0. errno 14 pid 864558 xolotl: /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: size_t PAMI::Dev ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. errno 14 pid 864557 xolotl: /__SMPI_build_dir__________________________/ibmsrc/pami/ibm-pami/buildtools/pami_build_port/../pami/components/devices/shmem/shaddr/CMAShaddr.h:164: size_t PAMI::Dev ice::Shmem::CMAShaddr::read_impl(PAMI::Memregion*, size_t, PAMI::Memregion*, size_t, size_t, bool*): Assertion `cbytes > 0' failed. [e28n07:864557] *** Process received signal *** [e28n07:864557] Signal: Aborted (6) [e28n07:864557] Signal code: (-6) [e28n07:864557] [ 0] linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x2000000604d8] [e28n07:864557] [ 1] /lib64/glibc-hwcaps/power9/libc-2.28.so(gsignal+0xd8)[0x200005d796f8] [e28n07:864557] [ 2] /lib64/glibc-hwcaps/power9/libc-2.28.so(abort+0x164)[0x200005d53ff4] [e28n07:864557] [ 3] /lib64/glibc-hwcaps/power9/libc-2.28.so(+0x3d280)[0x200005d6d280] [e28n07:864557] [ 4] [e28n07:864558] *** Process received signal *** [e28n07:864558] Signal: Aborted (6) [e28n07:864558] Signal code: (-6) [e28n07:864558] [ 0] linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x2000000604d8] [e28n07:864558] [ 1] /lib64/glibc-hwcaps/power9/libc-2.28.so(gsignal+0xd8)[0x200005d796f8] [e28n07:864558] [ 2] /lib64/glibc-hwcaps/power9/libc-2.28.so(abort+0x164)[0x200005d53ff4] [e28n07:864558] [ 3] /lib64/glibc-hwcaps/power9/libc-2.28.so(+0x3d280)[0x200005d6d280] [e28n07:864558] [ 4] /lib64/glibc-hwcaps/power9/libc-2.28.so(__assert_fail+0x64)[0x200005d6d324] [e28n07:864557] [ 5] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get7GetRdmaINS_6Device5Shmem8DmaModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic12NativeAt omicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEELb0EEESL_E6simpleEP18pami_rget_simple_t+0x1d8)[0x20007f3971d8] [e28n07:864557] [ 6] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get13CompositeRGetINS1_4RGetES3_E6simpleEP18pami_rget_simple_t+0x40)[0x20007f2ecc10] [e28n07:864557] [ 7] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI7Context9rget_implEP18pami_rget_simple_t+0x28c)[0x20007f31a78c] [e28n07:864557] [ 8] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Rget+0x18)[0x20007f2d94a8] [e28n07:864557] [ 9] 
/sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(process_rndv_msg+0x46c)[0x2000a80159ac] [e28n07:864557] [10] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(pml_pami_recv_rndv_cb+0x2bc)[0x2000a801670c] [e28n07:864557] [11] /lib64/glibc-hwcaps/power9/libc-2.28.so(__assert_fail+0x64)[0x200005d6d324] [e28n07:864558] [ 5] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get7GetRdmaINS_6Device5Shmem8DmaModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic12NativeAt omicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEELb0EEESL_E6simpleEP18pami_rget_simple_t+0x1d8)[0x20007f3971d8] [e28n07:864558] [ 6] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol3Get13CompositeRGetINS1_4RGetES3_E6simpleEP18pami_rget_simple_t+0x40)[0x20007f2ecc10] [e28n07:864558] [ 7] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI7Context9rget_implEP18pami_rget_simple_t+0x28c)[0x20007f31a78c] [e28n07:864558] [ 8] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Rget+0x18)[0x20007f2d94a8] [e28n07:864558] [ 9] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(process_rndv_msg+0x46c)[0x2000a80159ac] [e28n07:864558] [10] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(pml_pami_recv_rndv_cb+0x2bc)[0x2000a801670c] [e28n07:864558] [11] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol4Send11EagerSimpleINS_6Device5Shmem11PacketModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic 12NativeAtomicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEEEELNS1_15configuration_tE5EE15dispatch_packedEPvSP_mSP_SP_+0x4c)[0x20007f2e30ac] [e28n07:864557] [12] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Context_advancev+0x6b0)[0x20007f2da540] [e28n07:864557] [13] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(mca_pml_pami_progress+0x34)[0x2000a80073e4] [e28n07:864557] [14] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libopen-pal.so.3(opal_ progress+0x6c)[0x20003d60640c] [e28n07:864557] [15] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(ompi_r 
equest_default_wait_all+0x144)[0x2000034c4b04] [e28n07:864557] [16] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(PMPI_W aitall+0x10c)[0x20000352790c] [e28n07:864557] [17] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (_ZN4PAMI8Protocol4Send11EagerSimpleINS_6Device5Shmem11PacketModelINS3_11ShmemDeviceINS_4Fifo8WrapFifoINS7_10FifoPacketILj64ELj4096EEENS_7Counter15IndirectBoundedINS_6Atomic 12NativeAtomicEEELj256EEENSB_8IndirectINSB_6NativeEEENS4_9CMAShaddrELj256ELj512EEEEELNS1_15configuration_tE5EE15dispatch_packedEPvSP_mSP_SP_+0x4c)[0x20007f2e30ac] [e28n07:864558] [12] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/pami_port/libpami.so.3 (PAMI_Context_advancev+0x6b0)[0x20007f2da540] [e28n07:864558] [13] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/spectrum_mpi/mca_pml_p ami.so(mca_pml_pami_progress+0x34)[0x2000a80073e4] [e28n07:864558] [14] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libopen-pal.so.3(opal_ progress+0x6c)[0x20003d60640c] [e28n07:864558] [15] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(ompi_r equest_default_wait_all+0x144)[0x2000034c4b04] [e28n07:864558] [16] /sw/summit/spack-envs/summit-plus/opt/gcc-12.1.0/spectrum-mpi-10.4.0.6-20230210-db5xakaaqowbhp3nqwebpxrdbwtm4knu/container/../lib/libmpi_ibm.so.3(PMPI_W aitall+0x10c)[0x20000352790c] [e28n07:864558] [17] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3ca7b0)[0x2000004ea7b0] [e28n07:864557] [18] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3ca7b0)[0x2000004ea7b0] [e28n07:864558] [18] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3c5e68)[0x2000004e5e68] [e28n07:864557] [19] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3c5e68)[0x2000004e5e68] [e28n07:864558] [19] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(PetscSFBcastEnd+0x74)[0x2000004c9214] [e28n07:864557] [20] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(PetscSFBcastEnd+0x74)[0x2000004c9214] [e28n07:864558] [20] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3b4cb0)[0x2000004d4cb0] [e28n07:864557] [21] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x3b4cb0)[0x2000004d4cb0] [e28n07:864558] [21] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(VecScatterEnd+0x178)[0x2000004dd038] [e28n07:864558] [22] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(VecScatterEnd+0x178)[0x2000004dd038] [e28n07:864557] [22] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x1112be0)[0x200001232be0] [e28n07:864558] [23] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x1112be0)[0x200001232be0] [e28n07:864557] [23] 
/gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(DMGlobalToLocalEnd+0x470)[0x200000e9b0f0] [e28n07:864557] [24] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11PetscSolver11rhsFunctionEP5_p_TSdP6_p_VecS5 _+0xc4)[0x200005f710d4] [e28n07:864557] [25] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 00005f7130c] [e28n07:864557] [26] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(DMGlobalToLocalEnd+0x470)[0x200000e9b0f0] [e28n07:864558] [24] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11PetscSolver11rhsFunctionEP5_p_TSdP6_p_VecS5 _+0xc4)[0x200005f710d4] [e28n07:864558] [25] /gpfs/alpine2/mat267/proj-shared/code/xolotl-stable-cuda/xolotl/solver/libxolotlSolver.so(_ZN6xolotl6solver11RHSFunctionEP5_p_TSdP6_p_VecS4_Pv+0x2c)[0x2 00005f7130c] [e28n07:864558] [26] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] [e28n07:864557] [27] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeRHSFunction+0x1bc)[0x2000017621dc] [e28n07:864558] [27] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] [e28n07:864557] [28] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSComputeIFunction+0x418)[0x200001763ad8] [e28n07:864558] [28] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] [e28n07:864557] [29] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(+0x16f2ef0)[0x200001812ef0] [e28n07:864558] [29] /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] [e28n07:864557] *** End of error message *** /gpfs/alpine2/mat267/proj-shared/dependencies/petsc-kokkos-cuda/lib/libpetsc.so.3.020(TSStep+0x228)[0x200001768088] [e28n07:864558] *** End of error message *** It seems to be pointing to https://urldefense.us/v3/__https://petsc.org/release/manualpages/PetscSF/PetscSFBcastEnd/__;!!G_uCfscf7eWS!b1KHrO6bitMG2kKMIlPSxhA6aheY_aEXvUUBoYN6M7I3wMuBqRQGv_XD9mEneP_YWSx5VFtlcSkRNExrR2lL1PKO$ so I wanted to check if you had seen this type of error before and if it could be related to how the code is compiled or run. Let me know if I can provide any additional information. Best, Sophie -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Mon Mar 4 18:34:45 2024 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 5 Mar 2024 00:34:45 +0000 Subject: [petsc-users] 'Preconditioning' with lower-order method In-Reply-To: References: <8734t6woec.fsf@jedbrown.org> Message-ID: Ling, Are you using PETSc TS? If so, it may worth trying Crank-Nicolson first to see if the nonlinear solve becomes faster. In addition, you can try to improve the performance by pruning the Jacobian matrix. TSPruneIJacobianColor() sometimes can reduce the number of colors especially for high-order methods and make your Jacobian matrix more compact. An example of usage can be found here. If you are not using TS, there is a SNES version SNESPruneJacobianColor() for the same functionality. Hong (Mr.) 
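P.S. A minimal sketch of how the pruning call could be combined with the coloring-based Jacobian in a SNES code is below. The call order here is my assumption (the example linked above is the authoritative reference), and the snes/Pmat names are placeholders, not code from any particular application:

#include <petscsnes.h>

/* Sketch only: assumes SNESSetFunction() has already been called and that
   Pmat carries the preallocated nonzero pattern of the Jacobian. */
PetscErrorCode SetupColoredJacobianWithPruning(SNES snes, Mat Pmat)
{
  PetscFunctionBeginUser;
  /* Build the preconditioning Jacobian by finite differences with coloring;
     passing a NULL context lets SNES manage the MatFDColoring internally. */
  PetscCall(SNESSetJacobian(snes, Pmat, Pmat, SNESComputeJacobianDefaultColor, NULL));
  /* Drop entries that turn out to be numerically zero so the subsequent
     coloring needs fewer colors, i.e. fewer residual evaluations per Jacobian. */
  PetscCall(SNESPruneJacobianColor(snes, Pmat, Pmat));
  PetscFunctionReturn(PETSC_SUCCESS);
}
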
On Mar 3, 2024, at 11:48 PM, Zou, Ling via petsc-users wrote: From: Jed Brown > Date: Sunday, March 3, 2024 at 11:35 PM To: Zou, Ling >, Barry Smith > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] 'Preconditioning' with lower-order method If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually). Note ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually). ? Yes, I do use coloring, and I do provide sufficient stencil, i.e., neighbor?s neighbor. The sufficiency is confirmed by PETSc?s -snes_test_jacobian and -snes_test_jacobian_view options. Note that assembling a Jacobian of a second order TVD operator requires at least second neighbors while the first order needs only first neighbors, thus is much sparser and needs fewer colors to compute. ? In my code implementation, when marking the Jacobian nonzero pattern, I don?t differentiate FV1 or FV2, I always use the FV2 stencil, so it?s a bit ?fat? for the FV1 method, but worked just fine. I expect you're either not exploiting that in the timings or something else is amiss. You can run with `-log_view -snes_view -ksp_converged_reason` to get a bit more information about what's happening. ? The attached is screen output as you suggest. The linear and nonlinear performance of FV2 is both worse from the output. FV2: Time Step 149, time = 13229.7, dt = 100 NL Step = 0, fnorm = 7.80968E-03 Linear solve converged due to CONVERGED_RTOL iterations 26 NL Step = 1, fnorm = 7.65731E-03 Linear solve converged due to CONVERGED_RTOL iterations 24 NL Step = 2, fnorm = 6.85034E-03 Linear solve converged due to CONVERGED_RTOL iterations 27 NL Step = 3, fnorm = 6.11873E-03 Linear solve converged due to CONVERGED_RTOL iterations 25 NL Step = 4, fnorm = 1.57347E-03 Linear solve converged due to CONVERGED_RTOL iterations 27 NL Step = 5, fnorm = 9.03536E-04 SNES Object: 1 MPI process type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-06, solution=1e-08 total number of linear solver iterations=129 total number of function evaluations=144 norm schedule ALWAYS Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift to prevent zero pivot [NONZERO] matrix ordering: rcm factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 package used to perform factorization: petsc total: nonzeros=38485, allocated nonzeros=38485 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=8715, cols=8715 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 total: nonzeros=38485, allocated nonzeros=38485 total number of mallocs used during MatSetValues calls=0 not using I-node routines Solve Converged! FV1: Time Step 149, time = 13229.7, dt = 100 NL Step = 0, fnorm = 7.90072E-03 Linear solve converged due to CONVERGED_RTOL iterations 12 NL Step = 1, fnorm = 2.01919E-04 Linear solve converged due to CONVERGED_RTOL iterations 17 NL Step = 2, fnorm = 1.06960E-05 Linear solve converged due to CONVERGED_RTOL iterations 15 NL Step = 3, fnorm = 2.41683E-09 SNES Object: 1 MPI process type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-06, solution=1e-08 total number of linear solver iterations=44 total number of function evaluations=51 norm schedule ALWAYS Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift to prevent zero pivot [NONZERO] matrix ordering: rcm factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 package used to perform factorization: petsc total: nonzeros=38485, allocated nonzeros=38485 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=8715, cols=8715 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 total: nonzeros=38485, allocated nonzeros=38485 total number of mallocs used during MatSetValues calls=0 not using I-node routines Solve Converged! "Zou, Ling via petsc-users" > writes: > Barry, thank you. > I am not sure if I exactly follow you on this: > ?Are you forming the Jacobian for the first and second order cases inside of Newton?? > > The problem that we deal with, heat/mass transfer in heterogeneous systems (reactor system), is generally small in terms of size, i.e., # of DOFs (several k to maybe 100k level), so for now, I completely rely on PETSc to compute Jacobian, i.e., finite-differencing. > > That?s a good suggestion to see the time spent during various events. > What motivated me to try the options are the following observations. 
> > 2nd order FVM: > > Time Step 149, time = 13229.7, dt = 100 > > NL Step = 0, fnorm = 7.80968E-03 > > NL Step = 1, fnorm = 7.65731E-03 > > NL Step = 2, fnorm = 6.85034E-03 > > NL Step = 3, fnorm = 6.11873E-03 > > NL Step = 4, fnorm = 1.57347E-03 > > NL Step = 5, fnorm = 9.03536E-04 > > Solve Converged! > > 1st order FVM: > > Time Step 149, time = 13229.7, dt = 100 > > NL Step = 0, fnorm = 7.90072E-03 > > NL Step = 1, fnorm = 2.01919E-04 > > NL Step = 2, fnorm = 1.06960E-05 > > NL Step = 3, fnorm = 2.41683E-09 > > Solve Converged! > > Notice the obvious ?stagnant? in residual for the 2nd order method while not in the 1st order. > For the same problem, the wall time is 10 sec vs 6 sec. I would be happy if I can reduce 2 sec for the 2nd order method. > > -Ling > > From: Barry Smith > > Date: Sunday, March 3, 2024 at 12:06 PM > To: Zou, Ling > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] 'Preconditioning' with lower-order method > Are you forming the Jacobian for the first and second order cases inside of Newton? You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ..?.?) for the two cases > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Are you forming the Jacobian for the first and second order cases inside of Newton? > > You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ...) for the two cases and compare them. > > > > > On Mar 3, 2024, at 11:42?AM, Zou, Ling via petsc-users > wrote: > > Original email may have been sent to the incorrect place. > See below. > > -Ling > > From: Zou, Ling > > Date: Sunday, March 3, 2024 at 10:34 AM > To: petsc-users > > Subject: 'Preconditioning' with lower-order method > Hi all, > > I am solving a PDE system over a spatial domain. Numerical methods are: > > * Finite volume method (both 1st and 2nd order implemented) > * BDF1 and BDF2 for time integration. > What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity. > > I?m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation. > > Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0. > > 1. Option ? 1, multi-step for each time step > Step 1: solving F1(x) = 0 to obtain a temporary solution x1 > Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution. > [Not sure if gain any saving at all] > > > 1. Option -2, dynamically changing residual function F(x) > In pseudo code, would be something like. > > snesFormFunction(SNES snes, Vec u, Vec f, void *) > { > if (snes.nl_it_no < 4) // 4 being arbitrary here > f = F1(u); > else > f = F2(u); > } > > I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts? > > Best, > > -Ling -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lzou at anl.gov Mon Mar 4 19:38:41 2024 From: lzou at anl.gov (Zou, Ling) Date: Tue, 5 Mar 2024 01:38:41 +0000 Subject: [petsc-users] 'Preconditioning' with lower-order method In-Reply-To: References: <8734t6woec.fsf@jedbrown.org> Message-ID: From: Zhang, Hong Date: Monday, March 4, 2024 at 6:34 PM To: Zou, Ling Cc: Jed Brown , Barry Smith , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] 'Preconditioning' with lower-order method Ling, Are you using PETSc TS? If so, it may worth trying Crank-Nicolson first to see if the nonlinear solve becomes faster. >>>>> No, I?m not using TS, and I don?t plan to use CN. From my experience, when dealing with (nearly) incompressible flow problems, CN often cause (very large) pressure temporal oscillations, and to avoid that, the pressure is often using fully implicit method, so that would cause quite some code implementation issue. For the pressure oscillation issue, also see page 7 of INL/EXT-12-27197. Notes on Newton-Krylov Based Incompressible Flow Projection Solver In addition, you can try to improve the performance by pruning the Jacobian matrix. TSPruneIJacobianColor() sometimes can reduce the number of colors especially for high-order methods and make your Jacobian matrix more compact. An example of usage can be found here. If you are not using TS, there is a SNES version SNESPruneJacobianColor() for the same functionality. >>>>> The following code is how I setup the coloring. { // Create Matrix-free context MatCreateSNESMF(snes, &J_MatrixFree); // Let the problem setup Jacobian matrix sparsity p_sim->FillJacobianMatrixNonZeroEntry(P_Mat); // See PETSc examples: // https://urldefense.us/v3/__https://petsc.org/release/src/snes/tutorials/ex14.c.html__;!!G_uCfscf7eWS!aXVa0uz1LUIOdvZEPlRJOhRzz9h8MSM4vhl93kknxKGb8hkTyjCFJmSZIGr0fYx90rrqotBGdw-N3ZHE6Qw$ // https://urldefense.us/v3/__https://petsc.org/release/src/mat/tutorials/ex16.c.html__;!!G_uCfscf7eWS!aXVa0uz1LUIOdvZEPlRJOhRzz9h8MSM4vhl93kknxKGb8hkTyjCFJmSZIGr0fYx90rrqotBGdw-NcpSrhMM$ ISColoring iscoloring; MatColoring mc; MatColoringCreate(P_Mat, &mc); MatColoringSetType(mc, MATCOLORINGSL); MatColoringSetFromOptions(mc); MatColoringApply(mc, &iscoloring); MatColoringDestroy(&mc); MatFDColoringCreate(P_Mat, iscoloring, &fdcoloring); MatFDColoringSetFunction( fdcoloring, (PetscErrorCode(*)(void))(void (*)(void))snesFormFunction, this); MatFDColoringSetFromOptions(fdcoloring); MatFDColoringSetUp(P_Mat, iscoloring, fdcoloring); ISColoringDestroy(&iscoloring); // Should I prune here? Like SNESPruneJacobianColor(snes, P_Mat, P_Mat); SNESSetJacobian(snes, // snes J_MatrixFree, // Jacobian-free P_Mat, // Preconditioning matrix SNESComputeJacobianDefaultColor, // Use finite differencing and coloring fdcoloring); // fdcoloring } Thanks, -Ling Hong (Mr.) On Mar 3, 2024, at 11:48 PM, Zou, Ling via petsc-users wrote: From: Jed Brown > Date: Sunday, March 3, 2024 at 11:35 PM To: Zou, Ling >, Barry Smith > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] 'Preconditioning' with lower-order method If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually). Note ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. 
ZjQcmQRYFpfptBannerEnd If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually). ? Yes, I do use coloring, and I do provide sufficient stencil, i.e., neighbor?s neighbor. The sufficiency is confirmed by PETSc?s -snes_test_jacobian and -snes_test_jacobian_view options. Note that assembling a Jacobian of a second order TVD operator requires at least second neighbors while the first order needs only first neighbors, thus is much sparser and needs fewer colors to compute. ? In my code implementation, when marking the Jacobian nonzero pattern, I don?t differentiate FV1 or FV2, I always use the FV2 stencil, so it?s a bit ?fat? for the FV1 method, but worked just fine. I expect you're either not exploiting that in the timings or something else is amiss. You can run with `-log_view -snes_view -ksp_converged_reason` to get a bit more information about what's happening. ? The attached is screen output as you suggest. The linear and nonlinear performance of FV2 is both worse from the output. FV2: Time Step 149, time = 13229.7, dt = 100 NL Step = 0, fnorm = 7.80968E-03 Linear solve converged due to CONVERGED_RTOL iterations 26 NL Step = 1, fnorm = 7.65731E-03 Linear solve converged due to CONVERGED_RTOL iterations 24 NL Step = 2, fnorm = 6.85034E-03 Linear solve converged due to CONVERGED_RTOL iterations 27 NL Step = 3, fnorm = 6.11873E-03 Linear solve converged due to CONVERGED_RTOL iterations 25 NL Step = 4, fnorm = 1.57347E-03 Linear solve converged due to CONVERGED_RTOL iterations 27 NL Step = 5, fnorm = 9.03536E-04 SNES Object: 1 MPI process type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-06, solution=1e-08 total number of linear solver iterations=129 total number of function evaluations=144 norm schedule ALWAYS Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift to prevent zero pivot [NONZERO] matrix ordering: rcm factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 package used to perform factorization: petsc total: nonzeros=38485, allocated nonzeros=38485 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=8715, cols=8715 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 total: nonzeros=38485, allocated nonzeros=38485 total number of mallocs used during MatSetValues calls=0 not using I-node routines Solve Converged! FV1: Time Step 149, time = 13229.7, dt = 100 NL Step = 0, fnorm = 7.90072E-03 Linear solve converged due to CONVERGED_RTOL iterations 12 NL Step = 1, fnorm = 2.01919E-04 Linear solve converged due to CONVERGED_RTOL iterations 17 NL Step = 2, fnorm = 1.06960E-05 Linear solve converged due to CONVERGED_RTOL iterations 15 NL Step = 3, fnorm = 2.41683E-09 SNES Object: 1 MPI process type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-06, solution=1e-08 total number of linear solver iterations=44 total number of function evaluations=51 norm schedule ALWAYS Jacobian is applied matrix-free with differencing Preconditioning Jacobian is built using finite differences with coloring SNESLineSearch Object: 1 MPI process type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI process type: gmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift to prevent zero pivot [NONZERO] matrix ordering: rcm factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 package used to perform factorization: petsc total: nonzeros=38485, allocated nonzeros=38485 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI process type: mffd rows=8715, cols=8715 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI process type: seqaij rows=8715, cols=8715 total: nonzeros=38485, allocated nonzeros=38485 total number of mallocs used during MatSetValues calls=0 not using I-node routines Solve Converged! "Zou, Ling via petsc-users" > writes: > Barry, thank you. > I am not sure if I exactly follow you on this: > ?Are you forming the Jacobian for the first and second order cases inside of Newton?? > > The problem that we deal with, heat/mass transfer in heterogeneous systems (reactor system), is generally small in terms of size, i.e., # of DOFs (several k to maybe 100k level), so for now, I completely rely on PETSc to compute Jacobian, i.e., finite-differencing. > > That?s a good suggestion to see the time spent during various events. > What motivated me to try the options are the following observations. 
> > 2nd order FVM: > > Time Step 149, time = 13229.7, dt = 100 > > NL Step = 0, fnorm = 7.80968E-03 > > NL Step = 1, fnorm = 7.65731E-03 > > NL Step = 2, fnorm = 6.85034E-03 > > NL Step = 3, fnorm = 6.11873E-03 > > NL Step = 4, fnorm = 1.57347E-03 > > NL Step = 5, fnorm = 9.03536E-04 > > Solve Converged! > > 1st order FVM: > > Time Step 149, time = 13229.7, dt = 100 > > NL Step = 0, fnorm = 7.90072E-03 > > NL Step = 1, fnorm = 2.01919E-04 > > NL Step = 2, fnorm = 1.06960E-05 > > NL Step = 3, fnorm = 2.41683E-09 > > Solve Converged! > > Notice the obvious ?stagnant? in residual for the 2nd order method while not in the 1st order. > For the same problem, the wall time is 10 sec vs 6 sec. I would be happy if I can reduce 2 sec for the 2nd order method. > > -Ling > > From: Barry Smith > > Date: Sunday, March 3, 2024 at 12:06 PM > To: Zou, Ling > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] 'Preconditioning' with lower-order method > Are you forming the Jacobian for the first and second order cases inside of Newton? You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ..?.?) for the two cases > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Are you forming the Jacobian for the first and second order cases inside of Newton? > > You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ...) for the two cases and compare them. > > > > > On Mar 3, 2024, at 11:42?AM, Zou, Ling via petsc-users > wrote: > > Original email may have been sent to the incorrect place. > See below. > > -Ling > > From: Zou, Ling > > Date: Sunday, March 3, 2024 at 10:34 AM > To: petsc-users > > Subject: 'Preconditioning' with lower-order method > Hi all, > > I am solving a PDE system over a spatial domain. Numerical methods are: > > * Finite volume method (both 1st and 2nd order implemented) > * BDF1 and BDF2 for time integration. > What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity. > > I?m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation. > > Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0. > > 1. Option ? 1, multi-step for each time step > Step 1: solving F1(x) = 0 to obtain a temporary solution x1 > Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution. > [Not sure if gain any saving at all] > > > 1. Option -2, dynamically changing residual function F(x) > In pseudo code, would be something like. > > snesFormFunction(SNES snes, Vec u, Vec f, void *) > { > if (snes.nl_it_no < 4) // 4 being arbitrary here > f = F1(u); > else > f = F2(u); > } > > I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts? > > Best, > > -Ling -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From michal.habera at gmail.com Tue Mar 5 07:20:41 2024 From: michal.habera at gmail.com (Michal Habera) Date: Tue, 5 Mar 2024 14:20:41 +0100 Subject: [petsc-users] MUMPS Metis options Message-ID: Dear all, MUMPS allows custom configuration of the METIS library which it uses for symmetric permutations using "mumps_par%METIS OPTIONS", see p. 35 of https://urldefense.us/v3/__https://mumps-solver.org/doc/userguide_5.6.2.pdf__;!!G_uCfscf7eWS!aqzhWg_iCZLsu1jTz2sWKIpVkAIDvmvt-_4ojHy5ZtC2eWYBRlCU3zlqh72rGtso5KuDhEuf9yFxMNOGRhHoQBLH_kmf$ . Is it possible to provide these options to MUMPS via PETSc API? I can only find a way to control integer and real control parameters. -- Kind regards, Michal Habera -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Tue Mar 5 08:26:57 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Tue, 5 Mar 2024 15:26:57 +0100 Subject: [petsc-users] MUMPS Metis options In-Reply-To: References: Message-ID: > On 5 Mar 2024, at 2:20?PM, Michal Habera wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Dear all, > > MUMPS allows custom configuration of the METIS library which it uses > for symmetric permutations using "mumps_par%METIS OPTIONS", > see p. 35 of https://urldefense.us/v3/__https://mumps-solver.org/doc/userguide_5.6.2.pdf__;!!G_uCfscf7eWS!YlgQB0Z4fqIbHPr0NbNaJZivhlC-hjMLgRjRGox9nvwZQ1ptSyRvtQHlzqAJwdQgbUWP0HpQWM-NV5C0TiKNzA$ . > > Is it possible to provide these options to MUMPS via PETSc API? I can > only find a way to control integer and real control parameters. It is not possible, but I guess we could add something like MatMumpsSetMetisOptions(Mat A, PetscInt index, PetscInt value), just like MatMumpsSetIcntl(). Thanks, Pierre > -- > Kind regards, > > Michal Habera -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Tue Mar 5 11:51:49 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Tue, 5 Mar 2024 17:51:49 +0000 Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs Message-ID: Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake and have started testing our code solving a Poisson linear system with CG + HYPRE preconditioner. Timings look rather high compared to compilations done on other machines that have NVIDIA cards. They are also not changing when using more than one GPU for the simple test I doing. Does anyone happen to know if HYPRE has an hip GPU implementation for Boomer AMG and is it compiled when configuring PETSc? Thanks! 
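The CG + BoomerAMG combination is selected inside the application code (there are no -ksp_type/-pc_type entries in the option table of the log below); a minimal sketch of the equivalent PETSc API calls, with a hypothetical routine name rather than our actual solver code, would be:

#include <petscksp.h>

/* Hypothetical sketch: select CG preconditioned by hypre BoomerAMG.
   Assumes ksp was created and KSPSetOperators() was already called,
   and that PETSc was configured with --download-hypre. */
PetscErrorCode SelectCGBoomerAMG(KSP ksp)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPSetType(ksp, KSPCG));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCHYPRE));
  PetscCall(PCHYPRESetType(pc, "boomeramg"));
  /* Keep command-line overrides (-ksp_*, -pc_hypre_*) available. */
  PetscCall(KSPSetFromOptions(ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}
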
Marcos PS: This is what I see on the log file (-log_view) when running the case with 2 GPUs in the node: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by vanellam Tue Mar 5 12:42:29 2024 Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: 2024-03-05 01:04:54 +0000 Max Max/Min Avg Total Time (sec): 8.368e+02 1.000 8.368e+02 Objects: 0.000e+00 0.000 0.000e+00 Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 MPI Reductions: 6.331e+04 2877.545 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 100.0% 7.999e+04 100.0% 3.164e+04 50.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 100 MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 SFUnpack 19494 0.0 nan nan 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 7 3 Vector 7 1 Index Set 2 2 Star Forest Graph 1 0 Krylov Solver 1 0 Preconditioner 1 0 ======================================================================================================================== Average time to get PetscTime(): 3.01e-08 Average time for MPI_Barrier(): 3.8054e-06 Average time for zero size MPI_Send(): 7.101e-06 #PETSc Option Table entries: -log_view # (source: command line) -mat_type mpiaijkokkos # (source: command line) -vec_type kokkos # (source: command line) #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake ----------------------------------------- Libraries compiled on 2024-03-05 17:04:36 on login08 Machine characteristics: Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc Using PETSc arch: arch-linux-frontier-opt-gcc ----------------------------------------- Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 ----------------------------------------- Using include paths: -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse -I/opt/rocm-5.4.0/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -lpetsc -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib 
-Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib -L/opt/cray/pe/pmi/6.1.8/lib -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ -lquadmath -lmpi -lmpi_gtl_hsa ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Mar 5 13:41:45 2024 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 5 Mar 2024 14:41:45 -0500 Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs In-Reply-To: References: Message-ID: You can run with -log_view_gpu_time to get rid of the nans and get more data. You can run with -ksp_view to get more info on the solver and send that output. -options_left is also good to use so we can see what parameters you used. The last 100 in this row: KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 tells us that all the flops were logged on GPUs. You do need at least 100K equations per GPU to see speedup, so don't worry about small problems. Mark On Tue, Mar 5, 2024 at 12:52?PM Vanella, Marcos (Fed) via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos > and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" > FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos > and hip options: > > ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" > FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} > ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > > and have started testing our code solving a Poisson linear system with CG > + HYPRE preconditioner. 
Timings look rather high compared to compilations > done on other machines that have NVIDIA cards. They are also not changing > when using more than one GPU for the simple test I doing. > Does anyone happen to know if HYPRE has an hip GPU implementation for > Boomer AMG and is it compiled when configuring PETSc? > > Thanks! > > Marcos > > > PS: This is what I see on the log file (-log_view) when running the case > with 2 GPUs in the node: > > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier > on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by > vanellam Tue Mar 5 12:42:29 2024 > Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: > 2024-03-05 01:04:54 +0000 > > Max Max/Min Avg Total > Time (sec): 8.368e+02 1.000 8.368e+02 > Objects: 0.000e+00 0.000 0.000e+00 > Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 > Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 > MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 > MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 > MPI Reductions: 6.331e+04 2877.545 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 > 100.0% 7.999e+04 100.0% 3.164e+04 50.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total GPU - CpuToGpu - - > GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size > Count Size %F > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 > 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 > 0 0.00e+00 100 > MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 > 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 > 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 > 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 > 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 > 0 0.00e+00 0 > VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 > 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 
0.00e+00 0 > 0.00e+00 0 > SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 > 0.00e+00 0 > SFUnpack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 > 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 > 0 0.00e+00 100 > PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Matrix 7 3 > Vector 7 1 > Index Set 2 2 > Star Forest Graph 1 0 > Krylov Solver 1 0 > Preconditioner 1 0 > > ======================================================================================================================== > Average time to get PetscTime(): 3.01e-08 > Average time for MPI_Barrier(): 3.8054e-06 > Average time for zero size MPI_Send(): 7.101e-06 > #PETSc Option Table entries: > -log_view # (source: command line) > -mat_type mpiaijkokkos # (source: command line) > -vec_type kokkos # (source: command line) > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 > FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi > -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > ----------------------------------------- > Libraries compiled on 2024-03-05 17:04:36 on login08 > Machine characteristics: > Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 > Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc > Using PETSc arch: arch-linux-frontier-opt-gcc > ----------------------------------------- > > Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector > -fvisibility=hidden -O3 > Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none > -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 > ----------------------------------------- > > Using include paths: > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse > -I/opt/rocm-5.4.0/include > ----------------------------------------- > > Using C linker: cc > Using Fortran linker: ftn > Using libraries: > -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -lpetsc > 
-Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib > -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib > -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/ > 22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/ > 22.12.1.1/GNU/9.1/x86_64/lib > -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib > -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib > -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib > -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib > -L/opt/cray/pe/pmi/6.1.8/lib > -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 > -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 > -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 > -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 > -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 > -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib > -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib > -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod > -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd > -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas > -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem > -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 > -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ > -lquadmath -lmpi -lmpi_gtl_hsa > ----------------------------------------- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Tue Mar 5 13:59:54 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Tue, 5 Mar 2024 19:59:54 +0000 Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs In-Reply-To: References: Message-ID: Thank you Mark, I'll try the options you suggest to get more info. I'm also building PETSc and the code with the cray compiler suite to test. The test I'm running has 1 million unknowns. I was able to see good scaling up to 4 gpus on this case in Polaris. Talk soon, Marcos ________________________________ From: Mark Adams Sent: Tuesday, March 5, 2024 2:41 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs You can run with -log_view_gpu_time to get rid of the nans and get more data. You can run with -ksp_view to get more info on the solver and send that output. -options_left is also good to use so we can see what parameters you used. The last 100 in this row: KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 tells us that all the flops were logged on GPUs. You do need at least 100K equations per GPU to see speedup, so don't worry about small problems. 
Mark On Tue, Mar 5, 2024 at 12:52?PM Vanella, Marcos (Fed) via petsc-users > wrote: Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake and have started testing our code solving a Poisson linear system with CG + HYPRE preconditioner. Timings look rather high compared to compilations done on other machines that have NVIDIA cards. They are also not changing when using more than one GPU for the simple test I doing. Does anyone happen to know if HYPRE has an hip GPU implementation for Boomer AMG and is it compiled when configuring PETSc? Thanks! Marcos PS: This is what I see on the log file (-log_view) when running the case with 2 GPUs in the node: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by vanellam Tue Mar 5 12:42:29 2024 Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: 2024-03-05 01:04:54 +0000 Max Max/Min Avg Total Time (sec): 8.368e+02 1.000 8.368e+02 Objects: 0.000e+00 0.000 0.000e+00 Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 MPI Reductions: 6.331e+04 2877.545 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 100.0% 7.999e+04 100.0% 3.164e+04 50.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 100 MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 SFUnpack 19494 0.0 nan nan 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 7 3 Vector 7 1 Index Set 2 2 Star Forest Graph 1 0 Krylov Solver 1 0 Preconditioner 1 0 ======================================================================================================================== Average time to get PetscTime(): 3.01e-08 Average time for MPI_Barrier(): 3.8054e-06 Average time for zero size MPI_Send(): 7.101e-06 #PETSc Option Table entries: -log_view # (source: command line) -mat_type mpiaijkokkos # (source: command line) -vec_type kokkos # (source: command line) #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake ----------------------------------------- Libraries compiled on 2024-03-05 17:04:36 on login08 Machine characteristics: Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc Using PETSc arch: arch-linux-frontier-opt-gcc ----------------------------------------- Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 ----------------------------------------- Using include paths: -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse -I/opt/rocm-5.4.0/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -lpetsc -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib 
-Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib -L/opt/cray/pe/pmi/6.1.8/lib -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ -lquadmath -lmpi -lmpi_gtl_hsa ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ctchengben at mail.scut.edu.cn Wed Mar 6 00:39:02 2024 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Wed, 6 Mar 2024 14:39:02 +0800 (GMT+08:00) Subject: [petsc-users] Compile Error in configuring PETSc with Cygwin on Windows by using Intel MPI Message-ID: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> Hello, Last time I installed PETSc 3.19.2 with Cygwin in Windows10 successfully. Recently I try to install PETSc 3.13.6 with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: 1. PETSc: version 3.13.6 2. VS: version 2022 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit 4. Cygwin 5. External package: petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz And the compiler option in configuration is: ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec Then I build PETSc libraries with: make PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.13.6 PETSC_ARCH=arch-mswin-c-opt all but there return an error: **************************ERROR************************************* Error during compile, check arch-mswin-c-opt/lib/petsc/conf/make.log Send it and arch-mswin-c-opt/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** So I wrrit this email to report my problem and ask for your help. Looking forward your reply! sinserely, Cheng. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure.log URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: make.log URL: From balay at mcs.anl.gov Wed Mar 6 04:21:45 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 6 Mar 2024 04:21:45 -0600 (CST) Subject: [petsc-users] Compile Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> References: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <1a9700e2-58bb-c5c5-94dd-8f80fa962ddf@mcs.anl.gov> > make[3]: *** No rule to make target 'w'. Stop. Try the following to overcome the above error: make OMAKE_PRINTDIR=make all However 3.13.6 is a bit old - so don't know if it will work with these versions of compilers. Satish On Wed, 6 Mar 2024, ?? wrote: > Hello, > > > Last time I installed PETSc 3.19.2 with Cygwin in Windows10 successfully. > > Recently I try to install PETSc 3.13.6 with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > > 1. PETSc: version 3.13.6 > 2. VS: version 2022 > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > > 4. Cygwin > > 5. External package: petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz > > > > > > > > > > > And the compiler option in configuration is: > > ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec > > > > > Then I build PETSc libraries with: > > make PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.13.6 PETSC_ARCH=arch-mswin-c-opt all > > > > > > > > but there return an error: > > **************************ERROR************************************* > Error during compile, check arch-mswin-c-opt/lib/petsc/conf/make.log > Send it and arch-mswin-c-opt/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov > ******************************************************************** > > > > > > So I wrrit this email to report my problem and ask for your help. > > > Looking forward your reply! > > > sinserely, > Cheng. > > > > From Eric.Chamberland at giref.ulaval.ca Wed Mar 6 13:57:43 2024 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Wed, 6 Mar 2024 14:57:43 -0500 Subject: [petsc-users] Help with SLEPc eigenvectors convergence. Message-ID: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> An HTML attachment was scrubbed... URL: -------------- next part -------------- 1 EPS nconv=8 Values (Errors) 1.72355e-12 (0.00000000e+00) 4.83987e-12 (0.00000000e+00) 7.16714e-12 (0.00000000e+00) 199.286 (1.47893576e-39) 415.653 (1.99313185e-32) 570.22 (3.09568215e-29) 1294.72 (7.71230444e-19) 1295.67 (7.95775669e-19) 3769.79 (1.62599075e-08) 3771.16 (7.52750700e-09) 4323.24 (8.61648126e-09) 5141.1 (1.25820965e-08) 6023.68 (1.91065876e-05) 11729.6 (1.18367157e-02) 11845.2 (1.49674912e-02) 14148.3 (4.04353216e-02) 15071.8 (3.24127647e-02) 18157.1 (1.10833352e-01) 25128.7 (5.14116230e-02) 32499.7 (1.88695529e-01) 48189.7 (1.41699930e-01) 86602.1 (2.54292847e-01) 150776. 
(5.17558147e-01) 356904. (9.25768638e-01) 1.79343e+06 (1.80518276e+00) 2 EPS nconv=12 Values (Errors) 1.72355e-12 (0.00000000e+00) 4.83987e-12 (0.00000000e+00) 7.16714e-12 (0.00000000e+00) 199.286 (1.47893576e-39) 415.653 (1.99313185e-32) 570.22 (3.09568215e-29) 1294.72 (7.71230444e-19) 1295.67 (7.95775669e-19) 3769.79 (1.36933620e-19) 3771.16 (6.36276545e-20) 4323.24 (2.95572007e-19) 5141.1 (2.67431497e-18) 6023.68 (2.28621509e-14) 11727.3 (8.57510515e-08) 11841.5 (1.21364008e-07) 14111.7 (1.44584686e-05) 15046.9 (1.24205257e-03) 17784.4 (2.90225977e-03) 24962.4 (8.04532077e-02) 29636.1 (7.20333497e-02) 41881.1 (1.57409624e-01) 56572.2 (3.00061077e-01) 111415. (5.91435311e-01) 277740. (9.37846805e-01) 2.06838e+06 (2.35079865e+00) EPS Object: SolveurEPSGen (options_eps_gen) 1 MPI process type: krylovschur 50% of basis vectors kept after restart using the locking variant problem type: generalized symmetric eigenvalue problem selected portion of the spectrum: closest to target: 0. (in magnitude) postprocessing eigenvectors with purification computing all residuals (for tracking convergence) number of eigenvalues (nev): 10 number of column vectors (ncv): 25 maximum dimension of projected problem (mpd): 25 maximum number of iterations: 10000 tolerance: 1e-14 convergence test: relative to the eigenvalue BV Object: (options_eps_gen) 1 MPI process type: svec 26 columns of global length 150 vector orthogonalization method: modified Gram-Schmidt orthogonalization refinement: if needed (eta: 0.7071) block orthogonalization method: GS non-standard inner product tolerance for definite inner product: 2.22045e-15 inner product matrix: Mat Object: (MatB) 1 MPI process type: seqaij rows=150, cols=150 total: nonzeros=4932, allocated nonzeros=4932 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 50 nodes, limit used is 5 doing matmult as a single matrix-matrix product DS Object: (options_eps_gen) 1 MPI process type: hep solving the problem with: Implicit QR method (_steqr) ST Object: (options_eps_gen) 1 MPI process type: sinvert shift: 0. number of matrices: 2 nonzero pattern of the matrices: UNKNOWN KSP Object: (options_eps_genst_) 1 MPI process type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (options_eps_genst_) 1 MPI process type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: external factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: (options_eps_genst_) 1 MPI process type: mumps rows=150, cols=150 package used to perform factorization: mumps total: nonzeros=8028, allocated nonzeros=8028 MUMPS run parameters: Use -options_eps_genst_ksp_view ::ascii_info_detail to display information for all processes RINFOG(1) (global estimated flops for the elimination after analysis): 244937. RINFOG(2) (global estimated flops for the assembly after factorization): 7740. RINFOG(3) (global estimated flops for the elimination after factorization): 244937. 
(RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) INFOG(3) (estimated real workspace for factors on all processors after analysis): 8028 INFOG(4) (estimated integer workspace for factors on all processors after analysis): 1236 INFOG(5) (estimated maximum front size in the complete tree): 54 INFOG(6) (number of nodes in the complete tree): 15 INFOG(7) (ordering option effectively used after analysis): 2 INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 8028 INFOG(10) (total integer space store the matrix factors after factorization): 1236 INFOG(11) (order of largest frontal matrix after factorization): 54 INFOG(12) (number of off-diagonal pivots): 0 INFOG(13) (number of delayed pivots after factorization): 0 INFOG(14) (number of memory compress after factorization): 0 INFOG(15) (number of steps of iterative refinement after solution): 0 INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0 INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0 INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0 INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0 INFOG(20) (estimated number of entries in the factors): 8028 INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0 INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0 INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 INFOG(28) (after factorization: number of null pivots encountered): 0 INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 8028 INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0 INFOG(32) (after analysis: type of analysis done): 1 INFOG(33) (value used for ICNTL(8)): 7 INFOG(34) (exponent of the determinant if determinant is requested): 0 INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 8028 INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0 INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0 linear system matrix = precond matrix: Mat Object: (MatA) 1 MPI process type: seqaij rows=150, cols=150 total: nonzeros=4932, allocated nonzeros=4932 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 50 nodes, limit used is 5 options_eps_gen Linear eigensolve converged (12 eigenpairs) due to CONVERGED_TOL; iterations 2 Problem: some of the first 10 relative errors are higher than the tolerance -------------- next part -------------- A non-text 
attachment was scrubbed... Name: matrice0_gen.m Type: text/x-matlab Size: 154373 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrice1_gen.m Type: text/x-matlab Size: 153890 bytes Desc: not available URL: From pierre at joliv.et Thu Mar 7 01:01:00 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 7 Mar 2024 08:01:00 +0100 Subject: [petsc-users] Help with SLEPc eigenvectors convergence. In-Reply-To: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> References: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> Message-ID: <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> It seems your A is rank-deficient. If you slightly regularize the GEVP, e.g., -st_target 1.0E-6, you?ll get errors closer to 0. Thanks, Pierre > On 6 Mar 2024, at 8:57?PM, Eric Chamberland via petsc-users wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Hi, > > we have a simple generalized Hermitian problem (Kirchhoff plate > vibration) for which we are comparing SLEPc results with Matlab results. > > SLEPc computes eigenvalues correctly, as Matlab does. > > However, the output eigenvectors are not fully converged and we are > trying to understand where we have missed a convergence parameter or > anything else about eigenvectors. > > SLEPc warns us at the end of EPSSolve with this message: > > --- > Problem: some of the first 10 relative errors are higher than the > tolerance > --- > > And in fact, when we import the resulting vectors into Matlab, > "A*x-B*Lambda*x" isn't close to 0. > > Here are attached the EPS view output as the A and B matrices used. > > Any help or insights will be appreciated! :) > > Thanks, > Eric > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Mar 7 01:24:31 2024 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 7 Mar 2024 08:24:31 +0100 Subject: [petsc-users] Help with SLEPc eigenvectors convergence. In-Reply-To: <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> References: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> Message-ID: <2115731A-5434-4C1F-85EB-614366ABB75B@dsic.upv.es> An HTML attachment was scrubbed... URL: From hellyj at ucsd.edu Thu Mar 7 01:35:14 2024 From: hellyj at ucsd.edu (John Helly) Date: Wed, 6 Mar 2024 21:35:14 -1000 Subject: [petsc-users] Unsubscribe In-Reply-To: <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> References: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> Message-ID: An HTML attachment was scrubbed... URL: From badi.hamid at gmail.com Thu Mar 7 02:27:00 2024 From: badi.hamid at gmail.com (hamid badi) Date: Thu, 7 Mar 2024 09:27:00 +0100 Subject: [petsc-users] Compile Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> References: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: Hello If I find any time, I'll write a complete guide about PETSC (and additional modules like MUMPS) compilation with windows/visual studio/intel, is there other person interested ? Le mer. 6 mars 2024 ? 07:40, ?? a ?crit : > Hello, Last time I installed PETSc 3. 19. 2 with Cygwin in Windows10 > successfully. Recently I try to install PETSc 3. 13. 6 with Cygwin since > I'd like to use PETSc with Visual Studio on Windows10 plateform. 
For the > sake of clarity, I firstly list > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > Hello, > > Last time I installed PETSc 3.19.2 with Cygwin in Windows10 successfully. > > Recently I try to install PETSc 3.13.6 with Cygwin since I'd like to use > PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I > firstly list the softwares/packages used below: > 1. PETSc: version 3.13.6 > 2. VS: version 2022 > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > 4. Cygwin > > 5. External package: petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz > > > > > And the compiler option in configuration is: > > ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe > ifort' --with-cxx='win32fe cl' > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz > --with-shared-libraries=0 > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec > > > Then I build PETSc libraries with: > > make PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.13.6 > PETSC_ARCH=arch-mswin-c-opt all > > > > but there return an error: > > **************************ERROR************************************* > Error during compile, check arch-mswin-c-opt/lib/petsc/conf/make.log > Send it and arch-mswin-c-opt/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov > ******************************************************************** > > > So I wrrit this email to report my problem and ask for your help. > > Looking forward your reply! > > > sinserely, > Cheng. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Thu Mar 7 09:29:25 2024 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Thu, 7 Mar 2024 10:29:25 -0500 Subject: [petsc-users] Help with SLEPc eigenvectors convergence. In-Reply-To: <2115731A-5434-4C1F-85EB-614366ABB75B@dsic.upv.es> References: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> <2115731A-5434-4C1F-85EB-614366ABB75B@dsic.upv.es> Message-ID: Hi, wow, it was only that, it's now working perfectly! Curiously, I still have 2 times the message about relative errors followed by a new one with converged values, see: options_eps_gen Linear eigensolve converged (13 eigenpairs) due to CONVERGED_TOL; iterations 2 Problem: some of the first 10 relative errors are higher than the tolerance Problem: some of the first 10 relative errors are higher than the tolerance All requested eigenvalues computed up to the required tolerance: ????0.00000, 0.00000, 0.00000, 199.28609, 415.65289, 570.21994, 1294.72406, 1295.67360, ????3769.78800, 3771.15894 So is it intended to write these 2 warnings? Anyway thanks a lot! :) Eric On 2024-03-07 02:24, Jose E. Roman wrote: > Pierre's diagnostic is right, but the suggested option is wrong: it should be -eps_target > Also, I would suggest using a larger value, such as -eps_target 0.1, otherwise the tolerance 1e-14 might not be attained. > > Jose > > >> El 7 mar 2024, a las 8:01, Pierre Jolivet escribi?: >> >> This Message Is From an External Sender >> This message came from outside your organization. >> It seems your A is rank-deficient. 
>> If you slightly regularize the GEVP, e.g., -st_target 1.0E-6, you?ll get errors closer to 0. >> >> Thanks, >> Pierre >> >>> On 6 Mar 2024, at 8:57?PM, Eric Chamberland via petsc-users wrote: >>> >>> This Message Is From an External Sender >>> This message came from outside your organization. >>> Hi, >>> >>> we have a simple generalized Hermitian problem (Kirchhoff plate >>> vibration) for which we are comparing SLEPc results with Matlab results. >>> >>> SLEPc computes eigenvalues correctly, as Matlab does. >>> >>> However, the output eigenvectors are not fully converged and we are >>> trying to understand where we have missed a convergence parameter or >>> anything else about eigenvectors. >>> >>> SLEPc warns us at the end of EPSSolve with this message: >>> >>> --- >>> Problem: some of the first 10 relative errors are higher than the >>> tolerance >>> --- >>> >>> And in fact, when we import the resulting vectors into Matlab, >>> "A*x-B*Lambda*x" isn't close to 0. >>> >>> Here are attached the EPS view output as the A and B matrices used. >>> >>> Any help or insights will be appreciated! :) >>> >>> Thanks, >>> Eric >>> >>> -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Mar 7 09:36:17 2024 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 7 Mar 2024 16:36:17 +0100 Subject: [petsc-users] Help with SLEPc eigenvectors convergence. In-Reply-To: References: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> <2115731A-5434-4C1F-85EB-614366ABB75B@dsic.upv.es> Message-ID: An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Thu Mar 7 10:45:53 2024 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Thu, 7 Mar 2024 11:45:53 -0500 Subject: [petsc-users] Help with SLEPc eigenvectors convergence. In-Reply-To: References: <0e5db004-1a0e-4a9a-91e1-3e6568c294c4@giref.ulaval.ca> <27EBC7C0-7C48-4C74-B99D-8F52E859F1B0@joliv.et> <2115731A-5434-4C1F-85EB-614366ABB75B@dsic.upv.es> Message-ID: <07cb04f5-b599-44af-8d43-d4050b5d7d9a@giref.ulaval.ca> An HTML attachment was scrubbed... URL: From david.bold at ipp.mpg.de Thu Mar 7 19:27:56 2024 From: david.bold at ipp.mpg.de (David Bold) Date: Fri, 8 Mar 2024 02:27:56 +0100 Subject: [petsc-users] Broken links in FAQ Message-ID: An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Mar 8 07:38:03 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 8 Mar 2024 07:38:03 -0600 (CST) Subject: [petsc-users] Broken links in FAQ In-Reply-To: References: Message-ID: <8e7c0224-ccfb-c7aa-da17-b84296e0f443@mcs.anl.gov> Thanks for the report! The fix is at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7343__;!!G_uCfscf7eWS!aRgkooWLFmFbCqsaZyYQixeOgy0qD1N3WlxPMXIGCCA-fhjJ6DSuGhanT-xc5iuF4tjVn4BBShyMJnqZr2I0dEo$ Satish On Fri, 8 Mar 2024, David Bold wrote: > Dear all, I noticed that the links to TS and PetscSF are broken in the FAQ on the website [1]. Unfortunately I do not have a gitlab.?com account handy, so I could not open a bug. 
Best, > David [1]https:?//urldefense.?us/v3/__https:?//petsc.?org/release/*doc-index-citing-petsc__;Iw!!G_uCfscf7eWS!cA8OaCw8cHcgyL6gQl2uDCphmPd-jX0gmF1qUhry6mBj_WxHWgDp5mQ5tEdwo7zb84CgpHPnXFxh6-BjQVF1D > V6mN3Wj$ > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ? > ZjQcmQRYFpfptBannerEnd > > Dear all, > > I noticed that the links to TS and PetscSF are broken in the FAQ on the > website [1]. > > Unfortunately I do not have a gitlab.com account handy, so I could not > open a bug. > > Best, > David > > [1] https://urldefense.us/v3/__https://petsc.org/release/*doc-index-citing-petsc__;Iw!!G_uCfscf7eWS!cA8OaCw8cHcgyL6gQl2uDCphmPd-jX0gmF1qUhry6mBj_WxHWgDp5mQ5tEdwo7zb84CgpHPnXFxh6-BjQVF1DV > 6mN3Wj$ > > From balay at mcs.anl.gov Fri Mar 8 10:04:06 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 8 Mar 2024 10:04:06 -0600 (CST) Subject: [petsc-users] Broken links in FAQ In-Reply-To: <8e7c0224-ccfb-c7aa-da17-b84296e0f443@mcs.anl.gov> References: <8e7c0224-ccfb-c7aa-da17-b84296e0f443@mcs.anl.gov> Message-ID: <7934475c-0ceb-1bab-5221-6f44231db1a5@mcs.anl.gov> The website is now updated Satish On Fri, 8 Mar 2024, Satish Balay via petsc-users wrote: > Thanks for the report! The fix is at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7343__;!!G_uCfscf7eWS!aRgkooWLFmFbCqsaZyYQixeOgy0qD1N3WlxPMXIGCCA-fhjJ6DSuGhanT-xc5iuF4tjVn4BBShyMJnqZr2I0dEo$ > > Satish > > On Fri, 8 Mar 2024, David Bold wrote: > > > Dear all, I noticed that the links to TS and PetscSF are broken in the FAQ on the website [1]. Unfortunately I do not have a gitlab.?com account handy, so I could not open a bug. Best, > > David [1]https:?//urldefense.?us/v3/__https:?//petsc.?org/release/*doc-index-citing-petsc__;Iw!!G_uCfscf7eWS!cA8OaCw8cHcgyL6gQl2uDCphmPd-jX0gmF1qUhry6mBj_WxHWgDp5mQ5tEdwo7zb84CgpHPnXFxh6-BjQVF1D > > V6mN3Wj$ > > ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ? > > ZjQcmQRYFpfptBannerEnd > > > > Dear all, > > > > I noticed that the links to TS and PetscSF are broken in the FAQ on the > > website [1]. > > > > Unfortunately I do not have a gitlab.com account handy, so I could not > > open a bug. > > > > Best, > > David > > > > [1] https://urldefense.us/v3/__https://petsc.org/release/*doc-index-citing-petsc__;Iw!!G_uCfscf7eWS!cA8OaCw8cHcgyL6gQl2uDCphmPd-jX0gmF1qUhry6mBj_WxHWgDp5mQ5tEdwo7zb84CgpHPnXFxh6-BjQVF1DV > > 6mN3Wj$ > > > > > From y.hu at mpie.de Sun Mar 10 06:21:08 2024 From: y.hu at mpie.de (Yi Hu) Date: Sun, 10 Mar 2024 12:21:08 +0100 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch Message-ID: An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Mar 10 07:55:01 2024 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 10 Mar 2024 08:55:01 -0400 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch In-Reply-To: References: Message-ID: It looks like your input vector is the global vector, size 162, and the local matrix size is 81. Mark [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal input vector size 162 On Sun, Mar 10, 2024 at 7:21?AM Yi Hu wrote: > Dear petsc team, I implemented a matrix-free jacobian, and it can run > sequentially. 
But running parallel I got the pc error like this (running > with mpirun -np 2, only error from rank1 is presented here) [1]PETSC ERROR: > --------------------- > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Dear petsc team, > > I implemented a matrix-free jacobian, and it can run sequentially. But > running parallel I got the pc error like this (running with mpirun -np > 2, only error from rank1 is presented here) > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Nonconforming object sizes > [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal > input vector size 162 > [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0PlyUrtdfCA$ for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022 > [1]PETSC ERROR: /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a > arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 12:01:46 2024 > [1]PETSC ERROR: Configure options --download-fftw --download-hdf5 > --with-hdf5-fortran-bindings --download-fblaslapack --download-chaco > --download-hypre --download-metis --download-mumps --download-parmetis > --download-scalapack --download-suitesparse --download-superlu > --download-superlu_dist --download-triangle --download-zlib > --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 > --with-debugging=0 --with-ssl=0 --with-x=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 > FOPTFLAGS=-O3 > [1]PETSC ERROR: #1 PCApply() at > /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424 > [1]PETSC ERROR: #2 KSP_PCApply() at > /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376 > [1]PETSC ERROR: #3 KSPInitialResidual() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64 > [1]PETSC ERROR: #4 KSPSolve_GMRES() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242 > [1]PETSC ERROR: #5 KSPSolve_Private() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902 > [1]PETSC ERROR: #6 KSPSolve() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078 > [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at > /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222 > [1]PETSC ERROR: #8 SNESSolve() at > /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756 > [1]PETSC ERROR: #9 User provided function() at User file:0 > > However, from snes matrix-free documentation > (https://urldefense.us/v3/__https://petsc.org/release/manual/snes/*matrix-free-methods__;Iw!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0Ply6ZbywOw$), it is said > matrix-free is used with pcnone. So I assume it would not apply > preconditioner, but it did use preconditioning probably the same as my > matrix-free shell matrix. Here is how i initialize my shell matrix and > the corresponding customized multiplication. 
> > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > int(9*product(cells(1:2))*cells3,pPETSCINT),& > int(9*product(cells(1:2))*cells3,pPETSCINT),& > F_PETSc,Jac_PETSc,err_PETSc) > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > call > SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > call SNESGetKSP(SNES_mech,ksp,err_PETSc) > call PCSetType(pc,PCNONE,err_PETSc) > > And my GK_op is like > > subroutine GK_op(Jac,dF_global,output_local,err_PETSc) > > DM :: dm_local > Vec :: dF_global, dF_local, output_local > Mat :: Jac > PetscErrorCode :: err_PETSc > > real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal > > real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & > dF > real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & > output > > call SNESGetDM(SNES_mech,dm_local,err_PETSc) > > call DMGetLocalVector(dm_local,dF_local,err_PETSc) > call > DMGlobalToLocalBegin(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > call > DMGlobalToLocalEnd(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > > call DMDAVecGetArrayReadF90(dm_local,dF_local,dF_scal,err_PETSc) > dF = reshape(dF_scal, [3,3,cells(1),cells(2),cells3]) > > ....... > > > call DMDAVecRestoreArrayF90(dm_local,output_local,output_scal,err_PETSc) > CHKERRQ(err_PETSc) > > call DMDAVecRestoreArrayF90(dm_local,dF_local,dF_scal,err_PETSc) > CHKERRQ(err_PETSc) > > end subroutine GK_op > > I checked my cells3, it corresponds to my local size, and it seems the > local size of dF_local is ok. > > I am a bit lost here to find the reason for the preconditioner bug. > Could you help me on this? Thanks. > > Best regards, > > Yi > > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Sun Mar 10 09:16:15 2024 From: y.hu at mpie.de (Yi Hu) Date: Sun, 10 Mar 2024 15:16:15 +0100 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch In-Reply-To: References: Message-ID: <1fbb5696-a7d4-4826-85dc-539082a6566f@mpie.de> Dear Mark, Thanks for your reply. I see this mismatch. In fact my global DoF is 324. It seems like I always get the local size = global Dof / np^2, np is my processor number. By the way, I used DMDASNESsetFunctionLocal() to set my form function. Is it eligible to mix DMDASNESsetFunctionLocal() and a native SNESSetJacobian()? Best, Yi On 3/10/24 13:55, Mark Adams wrote: > It looks like your input vector is the global vector, size 162, and the local matrix size is 81. 
> Mark > [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal > input vector size 162 > > On Sun, Mar 10, 2024 at 7:21?AM Yi Hu wrote: > > Dear petsc team, I implemented a matrix-free jacobian, and it can > run sequentially. But running parallel I got the pc error like > this (running with mpirun -np 2, only error from rank1 is > presented here) [1]PETSC ERROR: --------------------- > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ZjQcmQRYFpfptBannerEnd > > Dear petsc team, > > I implemented a matrix-free jacobian, and it can run sequentially. But > running parallel I got the pc error like this (running with mpirun -np > 2, only error from rank1 is presented here) > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Nonconforming object sizes > [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal > input vector size 162 > [1]PETSC ERROR: Seehttps://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0PlyUrtdfCA$ for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022 > [1]PETSC ERROR: /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a > arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 12:01:46 2024 > [1]PETSC ERROR: Configure options --download-fftw --download-hdf5 > --with-hdf5-fortran-bindings --download-fblaslapack --download-chaco > --download-hypre --download-metis --download-mumps --download-parmetis > --download-scalapack --download-suitesparse --download-superlu > --download-superlu_dist --download-triangle --download-zlib > --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 > --with-debugging=0 --with-ssl=0 --with-x=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 > FOPTFLAGS=-O3 > [1]PETSC ERROR: #1 PCApply() at > /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424 > [1]PETSC ERROR: #2 KSP_PCApply() at > /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376 > [1]PETSC ERROR: #3 KSPInitialResidual() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64 > [1]PETSC ERROR: #4 KSPSolve_GMRES() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242 > [1]PETSC ERROR: #5 KSPSolve_Private() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902 > [1]PETSC ERROR: #6 KSPSolve() at > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078 > [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at > /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222 > [1]PETSC ERROR: #8 SNESSolve() at > /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756 > [1]PETSC ERROR: #9 User provided function() at Userfile:0 > > However, from snes matrix-free documentation > (https://urldefense.us/v3/__https://petsc.org/release/manual/snes/*matrix-free-methods__;Iw!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0Ply6ZbywOw$), it is said > matrix-free is used with pcnone. So I assume it would not apply > preconditioner, but it did use preconditioning probably the same as my > matrix-free shell matrix. Here is how i initialize my shell matrix and > the corresponding customized multiplication. > > ? call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > int(9*product(cells(1:2))*cells3,pPETSCINT),& > int(9*product(cells(1:2))*cells3,pPETSCINT),& > ????????????????????? F_PETSc,Jac_PETSc,err_PETSc) > ? 
call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > ? call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > ? call > SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > ? call SNESGetKSP(SNES_mech,ksp,err_PETSc) > ? call PCSetType(pc,PCNONE,err_PETSc) > > And my GK_op is like > > subroutine GK_op(Jac,dF_global,output_local,err_PETSc) > > ? DM?????????????????????????????????? :: dm_local > ? Vec????????????????????????????????? :: dF_global, dF_local, output_local > ? Mat????????????????????????????????? :: Jac > ? PetscErrorCode?????????????????????? :: err_PETSc > > ? real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal > > ? real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & > ??? dF > ? real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & > ??? output > > ? call SNESGetDM(SNES_mech,dm_local,err_PETSc) > > ? call DMGetLocalVector(dm_local,dF_local,err_PETSc) > ? call > DMGlobalToLocalBegin(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > ? call > DMGlobalToLocalEnd(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > > ? call DMDAVecGetArrayReadF90(dm_local,dF_local,dF_scal,err_PETSc) > ? dF = reshape(dF_scal, [3,3,cells(1),cells(2),cells3]) > > ....... > > > ? call DMDAVecRestoreArrayF90(dm_local,output_local,output_scal,err_PETSc) > ? CHKERRQ(err_PETSc) > > ? call DMDAVecRestoreArrayF90(dm_local,dF_local,dF_scal,err_PETSc) > ? CHKERRQ(err_PETSc) > > end subroutine GK_op > > I checked my cells3, it corresponds to my local size, and it seems the > local size of dF_local is ok. > > I am a bit lost here to find the reason for the preconditioner bug. > Could you help me on this? Thanks. > > Best regards, > > Yi > > > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de . > If you are not sure of the validity please contactrco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte anrco at mpie.de > ------------------------------------------------- > ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
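Given the symptom discussed above (local rows equal to the global size divided by np twice), one possible way to make the shell matrix layout match the DMDA layout is to hand MatCreateShell the local sizes explicitly and let PETSc determine the global size, instead of passing PETSC_DECIDE together with a global size computed from the per-rank cells3. The sketch below reuses the variable names from the code quoted earlier and is only an illustration of that idea, not necessarily the change adopted in this thread:

  PetscInt :: n_local

  ! cells3 is the number of z-layers owned by this rank, so this product
  ! is the per-rank (local) number of degrees of freedom.
  n_local = int(9*product(cells(1:2))*cells3,pPETSCINT)

  ! Specify the local row/column sizes and let PETSc derive the global size,
  ! so the shell matrix has the same parallel layout as the DMDA global vectors.
  call MatCreateShell(PETSC_COMM_WORLD, n_local, n_local, &
                      PETSC_DETERMINE, PETSC_DETERMINE, &
                      F_PETSc, Jac_PETSc, err_PETSc)
  CHKERRQ(err_PETSc)
  call MatShellSetOperation(Jac_PETSc, MATOP_MULT, GK_op, err_PETSc)
  CHKERRQ(err_PETSc)

A quick consistency check is to compare MatGetLocalSize() on the shell matrix with VecGetLocalSize() on a global vector obtained from the DMDA; the two must agree before the KSP/PC can apply the operator.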
URL: From bsmith at petsc.dev Sun Mar 10 12:44:39 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 10 Mar 2024 13:44:39 -0400 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch In-Reply-To: <1fbb5696-a7d4-4826-85dc-539082a6566f@mpie.de> References: <1fbb5696-a7d4-4826-85dc-539082a6566f@mpie.de> Message-ID: <5AA62E2D-402F-4FEB-B04A-B90336DBC6F5@petsc.dev> > On Mar 10, 2024, at 10:16?AM, Yi Hu wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Dear Mark, > > Thanks for your reply. I see this mismatch. In fact my global DoF is 324. It seems like I always get the local size = global Dof / np^2, np is my processor number. By the way, I used DMDASNESsetFunctionLocal() to set my form function. Is it eligible to mix DMDASNESsetFunctionLocal() and a native SNESSetJacobian()? > Yes > Best, > > Yi > > On 3/10/24 13:55, Mark Adams wrote: >> It looks like your input vector is the global vector, size 162, and the local matrix size is 81. >> Mark >> [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal >> input vector size 162 >> >> On Sun, Mar 10, 2024 at 7:21?AM Yi Hu > wrote: >>> This Message Is From an External Sender >>> This message came from outside your organization. >>> >>> Dear petsc team, >>> >>> I implemented a matrix-free jacobian, and it can run sequentially. But >>> running parallel I got the pc error like this (running with mpirun -np >>> 2, only error from rank1 is presented here) >>> >>> [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [1]PETSC ERROR: Nonconforming object sizes >>> [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal >>> input vector size 162 >>> [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0PlyUrtdfCA$ for trouble shooting. 
>>> [1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022 >>> [1]PETSC ERROR: /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a >>> arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 12:01:46 2024 >>> [1]PETSC ERROR: Configure options --download-fftw --download-hdf5 >>> --with-hdf5-fortran-bindings --download-fblaslapack --download-chaco >>> --download-hypre --download-metis --download-mumps --download-parmetis >>> --download-scalapack --download-suitesparse --download-superlu >>> --download-superlu_dist --download-triangle --download-zlib >>> --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 >>> --with-debugging=0 --with-ssl=0 --with-x=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 >>> FOPTFLAGS=-O3 >>> [1]PETSC ERROR: #1 PCApply() at >>> /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424 >>> [1]PETSC ERROR: #2 KSP_PCApply() at >>> /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376 >>> [1]PETSC ERROR: #3 KSPInitialResidual() at >>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64 >>> [1]PETSC ERROR: #4 KSPSolve_GMRES() at >>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242 >>> [1]PETSC ERROR: #5 KSPSolve_Private() at >>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902 >>> [1]PETSC ERROR: #6 KSPSolve() at >>> /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078 >>> [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at >>> /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222 >>> [1]PETSC ERROR: #8 SNESSolve() at >>> /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756 >>> [1]PETSC ERROR: #9 User provided function() at User file:0 >>> >>> However, from snes matrix-free documentation >>> (https://urldefense.us/v3/__https://petsc.org/release/manual/snes/*matrix-free-methods__;Iw!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0Ply6ZbywOw$), it is said >>> matrix-free is used with pcnone. So I assume it would not apply >>> preconditioner, but it did use preconditioning probably the same as my >>> matrix-free shell matrix. Here is how i initialize my shell matrix and >>> the corresponding customized multiplication. >>> >>> call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& >>> int(9*product(cells(1:2))*cells3,pPETSCINT),& >>> int(9*product(cells(1:2))*cells3,pPETSCINT),& >>> F_PETSc,Jac_PETSc,err_PETSc) >>> call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) >>> call SNESSetDM(SNES_mech,DM_mech,err_PETSc) >>> call >>> SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) >>> call SNESGetKSP(SNES_mech,ksp,err_PETSc) >>> call PCSetType(pc,PCNONE,err_PETSc) >>> >>> And my GK_op is like >>> >>> subroutine GK_op(Jac,dF_global,output_local,err_PETSc) >>> >>> DM :: dm_local >>> Vec :: dF_global, dF_local, output_local >>> Mat :: Jac >>> PetscErrorCode :: err_PETSc >>> >>> real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal >>> >>> real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & >>> dF >>> real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & >>> output >>> >>> call SNESGetDM(SNES_mech,dm_local,err_PETSc) >>> >>> call DMGetLocalVector(dm_local,dF_local,err_PETSc) >>> call >>> DMGlobalToLocalBegin(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) >>> call >>> DMGlobalToLocalEnd(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) >>> >>> call DMDAVecGetArrayReadF90(dm_local,dF_local,dF_scal,err_PETSc) >>> dF = reshape(dF_scal, [3,3,cells(1),cells(2),cells3]) >>> >>> ....... 
>>> >>> >>> call DMDAVecRestoreArrayF90(dm_local,output_local,output_scal,err_PETSc) >>> CHKERRQ(err_PETSc) >>> >>> call DMDAVecRestoreArrayF90(dm_local,dF_local,dF_scal,err_PETSc) >>> CHKERRQ(err_PETSc) >>> >>> end subroutine GK_op >>> >>> I checked my cells3, it corresponds to my local size, and it seems the >>> local size of dF_local is ok. >>> >>> I am a bit lost here to find the reason for the preconditioner bug. >>> Could you help me on this? Thanks. >>> >>> Best regards, >>> >>> Yi >>> >>> >>> >>> >>> ------------------------------------------------- >>> Stay up to date and follow us on LinkedIn, Twitter and YouTube. >>> >>> Max-Planck-Institut f?r Eisenforschung GmbH >>> Max-Planck-Stra?e 1 >>> D-40237 D?sseldorf >>> >>> Handelsregister B 2533 >>> Amtsgericht D?sseldorf >>> >>> Gesch?ftsf?hrung >>> Prof. Dr. Gerhard Dehm >>> Prof. Dr. J?rg Neugebauer >>> Prof. Dr. Dierk Raabe >>> Dr. Kai de Weldige >>> >>> Ust.-Id.-Nr.: DE 11 93 58 514 >>> Steuernummer: 105 5891 1000 >>> >>> >>> Please consider that invitations and e-mails of our institute are >>> only valid if they end with ?@mpie.de . >>> If you are not sure of the validity please contact rco at mpie.de >>> >>> Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails >>> aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. >>> In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de >>> ------------------------------------------------- >>> > > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From y.hu at mpie.de Mon Mar 11 03:48:24 2024 From: y.hu at mpie.de (Yi Hu) Date: Mon, 11 Mar 2024 09:48:24 +0100 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch In-Reply-To: <5AA62E2D-402F-4FEB-B04A-B90336DBC6F5@petsc.dev> References: <1fbb5696-a7d4-4826-85dc-539082a6566f@mpie.de> <5AA62E2D-402F-4FEB-B04A-B90336DBC6F5@petsc.dev> Message-ID: <850170af-808c-4a68-8aad-4825b9944260@mpie.de> Dear Barry, Thanks for your response. Now I am doing simple debugging for my customized mat_mult_op of my shell jacobian matrix. As far as I understand, because the input and output of shell jacobian are all global vectors, I need to do the global to local mapping of my input vector (dF) by myself. Before starting debugging the mapping, I first try to verify the size match of input and output. Because the input and output of a mat_mult_op of my shell matrix should have the same size. So I tried just equating them in my customized mat_mult_op. Basically like this subroutine GK_op(Jac,dF_global,output_global,err_PETSc) DM :: dm_local ! 
Yi: later for is,ie Vec :: dF_global Vec :: output_global PetscErrorCode :: err_PETSc real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal call SNESGetDM(SNES_mech,dm_local,err_PETSc) CHKERRQ(err_PETSc) output_global = dF_global end subroutine GK_op When I run with mpirun -np 3, it gives me similar error like previous, ?Preconditioner number of local rows 27 does not equal input vector size 81?, (I changed my problem size so the numbers are different). Maybe simple equating input and output is not valid (due to ownership of different portion of a global vector). Then it may give me different error message. In fact my global dF has size 9*3*3*3, when running on 3 processors, the local dF has size 9*3*3*1 (I split my domain in z direction). The error message seems to suggest I am using a local dF rather than a global dF. And the output and input vector sizes seems to be different. Do I miss something here? Best regards, Yi From: Barry Smith Sent: Sunday, March 10, 2024 6:45 PM To: Yi Hu Cc: Mark Adams ; petsc-users Subject: Re: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch On Mar 10, 2024, at 10:16?AM, Yi Hu wrote: This Message Is From an External Sender This message came from outside your organization. Dear Mark, Thanks for your reply. I see this mismatch. In fact my global DoF is 324. It seems like I always get the local size = global Dof / np^2, np is my processor number. By the way, I used DMDASNESsetFunctionLocal() to set my form function. Is it eligible to mix DMDASNESsetFunctionLocal() and a native SNESSetJacobian()? Yes Best, Yi On 3/10/24 13:55, Mark Adams wrote: It looks like your input vector is the global vector, size 162, and the local matrix size is 81.Mark[1]PETSC ERROR: Preconditioner number of local rows 81 does not equal input vector size 162 On Sun, Mar 10, 2024 at 7:21?AM Yi Hu wrote: This Message Is From an External Sender This message came from outside your organization. Dear petsc team, I implemented a matrix-free jacobian, and it can run sequentially. 
But running parallel I got the pc error like this (running with mpirun -np 2, only error from rank1 is presented here) [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------[1]PETSC ERROR: Nonconforming object sizes[1]PETSC ERROR: Preconditioner number of local rows 81 does not equal input vector size 162[1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0PlyUrtdfCA$ for trouble shooting.[1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022[1]PETSC ERROR: /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 12:01:46 2024[1]PETSC ERROR: Configure options --download-fftw --download-hdf5 --with-hdf5-fortran-bindings --download-fblaslapack --download-chaco --download-hypre --download-metis --download-mumps --download-parmetis --download-scalapack --download-suitesparse --download-superlu --download-superlu_dist --download-triangle --download-zlib --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 --with-debugging=0 --with-ssl=0 --with-x=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3[1]PETSC ERROR: #1 PCApply() at /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424[1]PETSC ERROR: #2 KSP_PCApply() at /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376[1]PETSC ERROR: #3 KSPInitialResidual() at /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64[1]PETSC ERROR: #4 KSPSolve_GMRES() at /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242[1]PETSC ERROR: #5 KSPSolve_Private() at /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902[1]PETSC ERROR: #6 KSPSolve() at /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078[1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222[1]PETSC ERROR: #8 SNESSolve() at /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756[1]PETSC ERROR: #9 User provided function() at User file:0 However, from snes matrix-free documentation (https://urldefense.us/v3/__https://petsc.org/release/manual/snes/*matrix-free-methods__;Iw!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0Ply6ZbywOw$), it is said matrix-free is used with pcnone. So I assume it would not apply preconditioner, but it did use preconditioning probably the same as my matrix-free shell matrix. Here is how i initialize my shell matrix and the corresponding customized multiplication. 
call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,&int(9*product(cells(1:2))*cells3,pPETSCINT),&int(9*product(cells(1:2))*cells3,pPETSCINT),& F_PETSc,Jac_PETSc,err_PETSc) call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) call SNESSetDM(SNES_mech,DM_mech,err_PETSc) call SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) call SNESGetKSP(SNES_mech,ksp,err_PETSc) call PCSetType(pc,PCNONE,err_PETSc) And my GK_op is like subroutine GK_op(Jac,dF_global,output_local,err_PETSc) DM :: dm_local Vec :: dF_global, dF_local, output_local Mat :: Jac PetscErrorCode :: err_PETSc real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & dF real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & output call SNESGetDM(SNES_mech,dm_local,err_PETSc) call DMGetLocalVector(dm_local,dF_local,err_PETSc) call DMGlobalToLocalBegin(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) call DMGlobalToLocalEnd(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) call DMDAVecGetArrayReadF90(dm_local,dF_local,dF_scal,err_PETSc) dF = reshape(dF_scal, [3,3,cells(1),cells(2),cells3]) ....... call DMDAVecRestoreArrayF90(dm_local,output_local,output_scal,err_PETSc) CHKERRQ(err_PETSc) call DMDAVecRestoreArrayF90(dm_local,dF_local,dF_scal,err_PETSc) CHKERRQ(err_PETSc) end subroutine GK_op I checked my cells3, it corresponds to my local size, and it seems the local size of dF_local is ok. I am a bit lost here to find the reason for the preconditioner bug. Could you help me on this? Thanks. Best regards, Yi -------------------------------------------------Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbHMax-Planck-Stra?e 1D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrungProf. Dr. Gerhard DehmProf. Dr. J?rg NeugebauerProf. Dr. Dierk RaabeDr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mailsaus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. 
Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Mar 11 07:10:57 2024 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 11 Mar 2024 08:10:57 -0400 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch In-Reply-To: <850170af-808c-4a68-8aad-4825b9944260@mpie.de> References: <1fbb5696-a7d4-4826-85dc-539082a6566f@mpie.de> <5AA62E2D-402F-4FEB-B04A-B90336DBC6F5@petsc.dev> <850170af-808c-4a68-8aad-4825b9944260@mpie.de> Message-ID: I think you misunderstand "global" vector. The global solution is never on one processor. A global vector in this case seems to have 81 global values and 27 local values. It looks like you create a "global" vector that has 81 local values that should never be created other than for debigging. GlobalToLocal refers to a "local" vector with ghost cells, so it would have > 27 values in this case and you use it for local operations only (its communicator is PETSC_COMM_SELF). Hope this helps, Mark On Mon, Mar 11, 2024 at 4:48?AM Yi Hu wrote: > Dear Barry, > > > > Thanks for your response. Now I am doing simple debugging for my > customized mat_mult_op of my shell jacobian matrix. As far as I understand, > because the input and output of shell jacobian are all global vectors, I > need to do the global to local mapping of my input vector (dF) by myself. > Before starting debugging the mapping, I first try to verify the size match > of input and output. > > > > Because the input and output of a mat_mult_op of my shell matrix should > have the same size. So I tried just equating them in my customized > mat_mult_op. Basically like this > > > > subroutine GK_op(Jac,dF_global,output_global,err_PETSc) > > > > DM :: dm_local ! Yi: later for is,ie > > Vec :: dF_global > > Vec :: output_global > > PetscErrorCode :: err_PETSc > > > > real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal > > > > call SNESGetDM(SNES_mech,dm_local,err_PETSc) > > CHKERRQ(err_PETSc) > > > > output_global = dF_global > > > > end subroutine GK_op > > > > When I run with mpirun -np 3, it gives me similar error like previous, > ?Preconditioner number of local rows 27 does not equal input vector size > 81?, (I changed my problem size so the numbers are different). > > > > Maybe simple equating input and output is not valid (due to ownership of > different portion of a global vector). Then it may give me different error > message. In fact my global dF has size 9*3*3*3, when running on 3 > processors, the local dF has size 9*3*3*1 (I split my domain in z > direction). The error message seems to suggest I am using a local dF rather > than a global dF. And the output and input vector sizes seems to be > different. Do I miss something here? 
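[Editor's note: to make the global/local distinction above concrete, here is a minimal debugging sketch, not taken from the original thread. The subroutine name GK_op_debug is illustrative; the argument list mirrors the MATOP_MULT callback quoted in this thread. It only checks the sizes of the two vectors handed to the shell multiply and copies input to output with VecCopy(); a plain Fortran assignment such as "output_global = dF_global" only copies the Vec handle, not the values.]

      subroutine GK_op_debug(Jac,dF_global,output_global,err_PETSc)
#include <petsc/finclude/petscmat.h>
        use petscvec
        use petscmat
        implicit none
        Mat            :: Jac
        Vec            :: dF_global, output_global
        PetscErrorCode :: err_PETSc
        PetscInt       :: n_global, n_local

        ! both arguments are global vectors: same global size,
        ! the local size is only this rank's owned block
        call VecGetSize(dF_global,n_global,err_PETSc)
        CHKERRQ(err_PETSc)
        call VecGetLocalSize(dF_global,n_local,err_PETSc)
        CHKERRQ(err_PETSc)
        print *, 'input vector: local size ', n_local, ' of global ', n_global

        ! identity action for debugging: copy values, do not assign handles
        call VecCopy(dF_global,output_global,err_PETSc)
        CHKERRQ(err_PETSc)
      end subroutine GK_op_debug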
> > > > Best regards, > > Yi > > > > > > *From:* Barry Smith > *Sent:* Sunday, March 10, 2024 6:45 PM > *To:* Yi Hu > *Cc:* Mark Adams ; petsc-users > *Subject:* Re: [petsc-users] snes matrix-free jacobian fails with > preconditioner shape mismatch > > > > > > > > On Mar 10, 2024, at 10:16?AM, Yi Hu wrote: > > > > This Message Is From an External Sender > > This message came from outside your organization. > > Dear Mark, > > Thanks for your reply. I see this mismatch. In fact my global DoF is 324. > It seems like I always get the local size = global Dof / np^2, np is my > processor number. By the way, I used DMDASNESsetFunctionLocal() to set my > form function. Is it eligible to mix DMDASNESsetFunctionLocal() and a > native SNESSetJacobian()? > > > > Yes > > > > Best, > > Yi > > On 3/10/24 13:55, Mark Adams wrote: > > It looks like your input vector is the global vector, size 162, and the local matrix size is 81. > > Mark > > [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal > > input vector size 162 > > > > On Sun, Mar 10, 2024 at 7:21?AM Yi Hu wrote: > > *This Message Is From an External Sender * > > This message came from outside your organization. > > > > Dear petsc team, > > > > I implemented a matrix-free jacobian, and it can run sequentially. But > > running parallel I got the pc error like this (running with mpirun -np > > 2, only error from rank1 is presented here) > > > > [1]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [1]PETSC ERROR: Nonconforming object sizes > > [1]PETSC ERROR: Preconditioner number of local rows 81 does not equal > > input vector size 162 > > [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0PlyUrtdfCA$ for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022 > > [1]PETSC ERROR: /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a > > arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 12:01:46 2024 > > [1]PETSC ERROR: Configure options --download-fftw --download-hdf5 > > --with-hdf5-fortran-bindings --download-fblaslapack --download-chaco > > --download-hypre --download-metis --download-mumps --download-parmetis > > --download-scalapack --download-suitesparse --download-superlu > > --download-superlu_dist --download-triangle --download-zlib > > --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 > > --with-debugging=0 --with-ssl=0 --with-x=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 > > FOPTFLAGS=-O3 > > [1]PETSC ERROR: #1 PCApply() at > > /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424 > > [1]PETSC ERROR: #2 KSP_PCApply() at > > /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376 > > [1]PETSC ERROR: #3 KSPInitialResidual() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64 > > [1]PETSC ERROR: #4 KSPSolve_GMRES() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242 > > [1]PETSC ERROR: #5 KSPSolve_Private() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902 > > [1]PETSC ERROR: #6 KSPSolve() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078 > > [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at > > /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222 > > [1]PETSC ERROR: #8 SNESSolve() at > > /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756 > > [1]PETSC ERROR: #9 User provided function() at User file:0 > > > > However, from snes matrix-free documentation > > (https://urldefense.us/v3/__https://petsc.org/release/manual/snes/*matrix-free-methods__;Iw!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0Ply6ZbywOw$ ), it is said > > matrix-free is used with pcnone. So I assume it would not apply > > preconditioner, but it did use preconditioning probably the same as my > > matrix-free shell matrix. Here is how i initialize my shell matrix and > > the corresponding customized multiplication. > > > > call MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > > int(9*product(cells(1:2))*cells3,pPETSCINT),& > > int(9*product(cells(1:2))*cells3,pPETSCINT),& > > F_PETSc,Jac_PETSc,err_PETSc) > > call MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > > call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > > call > > SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > > call SNESGetKSP(SNES_mech,ksp,err_PETSc) > > call PCSetType(pc,PCNONE,err_PETSc) > > > > And my GK_op is like > > > > subroutine GK_op(Jac,dF_global,output_local,err_PETSc) > > > > DM :: dm_local > > Vec :: dF_global, dF_local, output_local > > Mat :: Jac > > PetscErrorCode :: err_PETSc > > > > real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, output_scal > > > > real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & > > dF > > real(pREAL), dimension(3,3,cells(1),cells(2),cells3) :: & > > output > > > > call SNESGetDM(SNES_mech,dm_local,err_PETSc) > > > > call DMGetLocalVector(dm_local,dF_local,err_PETSc) > > call > > DMGlobalToLocalBegin(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > > call > > DMGlobalToLocalEnd(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > > > > call DMDAVecGetArrayReadF90(dm_local,dF_local,dF_scal,err_PETSc) > > dF = reshape(dF_scal, [3,3,cells(1),cells(2),cells3]) > > > > ....... 
> > > > > > call DMDAVecRestoreArrayF90(dm_local,output_local,output_scal,err_PETSc) > > CHKERRQ(err_PETSc) > > > > call DMDAVecRestoreArrayF90(dm_local,dF_local,dF_scal,err_PETSc) > > CHKERRQ(err_PETSc) > > > > end subroutine GK_op > > > > I checked my cells3, it corresponds to my local size, and it seems the > > local size of dF_local is ok. > > > > I am a bit lost here to find the reason for the preconditioner bug. > > Could you help me on this? Thanks. > > > > Best regards, > > > > Yi > > > > > > > > > > ------------------------------------------------- > > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > > > Max-Planck-Institut f?r Eisenforschung GmbH > > Max-Planck-Stra?e 1 > > D-40237 D?sseldorf > > Handelsregister B 2533 > > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > > Prof. Dr. Gerhard Dehm > > Prof. Dr. J?rg Neugebauer > > Prof. Dr. Dierk Raabe > > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > > Steuernummer: 105 5891 1000 > > > > > > Please consider that invitations and e-mails of our institute are > > only valid if they end with ?@mpie.de . > > If you are not sure of the validity please contact rco at mpie.de > > > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > > ------------------------------------------------- > > > > > > ------------------------------ > > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marco at kit.ac.jp Tue Mar 12 03:18:07 2024 From: marco at kit.ac.jp (Marco Seiz) Date: Tue, 12 Mar 2024 17:18:07 +0900 Subject: [petsc-users] Fieldsplit, multigrid and DM interaction Message-ID: An HTML attachment was scrubbed... URL: From ctchengben at mail.scut.edu.cn Tue Mar 12 09:03:25 2024 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Tue, 12 Mar 2024 22:03:25 +0800 (GMT+08:00) Subject: [petsc-users] Compile Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <1a9700e2-58bb-c5c5-94dd-8f80fa962ddf@mcs.anl.gov> References: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> <1a9700e2-58bb-c5c5-94dd-8f80fa962ddf@mcs.anl.gov> Message-ID: <33679c3a.6e5.18e32f9a646.Coremail.ctchengben@mail.scut.edu.cn> An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Mar 12 10:01:14 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 12 Mar 2024 10:01:14 -0500 (CDT) Subject: [petsc-users] Compile Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <33679c3a.6e5.18e32f9a646.Coremail.ctchengben@mail.scut.edu.cn> References: <584661e7.10133.18e127ca59c.Coremail.ctchengben@mail.scut.edu.cn> <1a9700e2-58bb-c5c5-94dd-8f80fa962ddf@mcs.anl.gov> <33679c3a.6e5.18e32f9a646.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <057811a0-5782-d845-acee-6686e330708b@mcs.anl.gov> Glad you have a successful build! Thanks for the update. Satish On Tue, 12 Mar 2024, ?? wrote: > Hi Satish Sorry for replying to your email so late, I follow your suggestion and it have been installed successfully. Thank you so much. best wishes, Ben > -----????----- > ???: "Satish Balay" > ????:?2024-03-06 > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ? > ZjQcmQRYFpfptBannerEnd > > Hi Satish > Sorry for replying to your email so late, I follow your suggestion and it have been installed successfully. > Thank you so much. > > best > wishes, > Ben > > > > -----????----- > > ???: "Satish Balay" > > ????:2024-03-06 18:21:45 (???) > > ???: ?? > > ??: petsc-users at mcs.anl.gov > > ??: Re: [petsc-users] Compile Error in configuring PETSc with Cygwin on Windows by using Intel MPI > > > > > make[3]: *** No rule to make target 'w'. Stop. > > > > Try the following to overcome the above error: > > > > make OMAKE_PRINTDIR=make all > > > > However 3.13.6 is a bit old - so don't know if it will work with these versions of compilers. > > > > Satish > > > > On Wed, 6 Mar 2024, ?? wrote: > > > > > Hello, > > > > > > > > > Last time I installed PETSc 3.19.2 with Cygwin in Windows10 successfully. > > > > > > Recently I try to install PETSc 3.13.6 with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > > > > > > 1. PETSc: version 3.13.6 > > > 2. VS: version 2022 > > > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > > > > > > > > 4. Cygwin > > > > > > 5. 
External package: petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > And the compiler option in configuration is: > > > > > > ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' > > > > > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 > > > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > > > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec > > > > > > > > > > > > > > > Then I build PETSc libraries with: > > > > > > make PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.13.6 PETSC_ARCH=arch-mswin-c-opt all > > > > > > > > > > > > > > > > > > > > > > > > but there return an error: > > > > > > **************************ERROR************************************* > > > Error during compile, check arch-mswin-c-opt/lib/petsc/conf/make.log > > > Send it and arch-mswin-c-opt/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov > > > ******************************************************************** > > > > > > > > > > > > > > > > > > So I wrrit this email to report my problem and ask for your help. > > > > > > > > > Looking forward your reply! > > > > > > > > > sinserely, > > > Cheng. > > > > > > > > > > > > > > From adigitoleo at posteo.net Tue Mar 12 19:19:36 2024 From: adigitoleo at posteo.net (adigitoleo (Leon)) Date: Wed, 13 Mar 2024 00:19:36 +0000 Subject: [petsc-users] petsc4py error code 86 from ViewerHDF5().create Message-ID: An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Mar 12 19:40:09 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 12 Mar 2024 20:40:09 -0400 Subject: [petsc-users] petsc4py error code 86 from ViewerHDF5().create In-Reply-To: References: Message-ID: <75C2083E-F5A6-4DD0-9EE2-782809EF1F6D@petsc.dev> You need to ./configure PETSc for HDF5 using > --with-fortran-bindings=0 --with-mpi-dir=/usr --download-hdf5 It may need additional options, if it does then rerun the ./configure with the additional options it lists. > On Mar 12, 2024, at 8:19?PM, adigitoleo (Leon) wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Hello, > > I'm new to the list and have a limited knowledge of PETSc so far, but > I'm trying to use a software (underworld3) that relies on petsc4py. > I have built PETSc with the following configure options: > > --with-fortran-bindings=0 --with-mpi-dir=/usr > > and `make test` gives me 160 failures which all seem to be timeouts or > arising from my having insufficient "slots" (cores?). I subsequently > built underworld3 with something like > > cd $PETSC_DIR > PETSC_DIR=... PETSC_ARCH=... NUMPY_INCLUDE=... pip install src/binding/petsc4py > cd /path/to/underworld3/tree > pip install h5py > pip install mpi4py > PETSC_DIR=... PETSC_ARCH=... NUMPY_INCLUDE=... pip install -e . > > following their instructions. Building their python wheel/package was > successful, however when I run their tests (using pytest) I get errors > during test collection, which all come from petsc4py and have a stack > trace that ends in the snippet attached below. Am I going about this > wrong? How do I ensure that the HDF5 types are defined? 
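[Editor's note: for completeness, a sketch of the reconfigure/rebuild sequence implied by the advice above, using only options and paths already quoted in this thread; the PETSC_ARCH value arch-linux-c-debug is taken from the error output below, and the exact make line is the one configure prints at the end.]

    cd $PETSC_DIR
    ./configure --with-fortran-bindings=0 --with-mpi-dir=/usr --download-hdf5
    make PETSC_DIR=$PETSC_DIR PETSC_ARCH=arch-linux-c-debug all
    # then reinstall petsc4py as in the earlier message, with PETSC_DIR/PETSC_ARCH set
    pip install src/binding/petsc4py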
> > src/underworld3/discretisation.py:86: in _from_gmsh > viewer = PETSc.ViewerHDF5().create(filename + ".h5", "w", comm=PETSc.COMM_SELF) > petsc4py/PETSc/Viewer.pyx:916: in petsc4py.PETSc.ViewerHDF5.create > ??? > E petsc4py.PETSc.Error: error code 86 > ------------------------------- Captured stderr -------------------------------- > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: https://urldefense.us/v3/__https://petsc.org/release/install/install/*external-packages__;Iw!!G_uCfscf7eWS!duVp7PZwdvHgymCgufX290k3tptJCHEo3vrV7dNt9zumYwqzDVsb1AG1HHargxq0LL-1JO6mjgiS7Vbykb1_siyXWw$ > [0]PETSC ERROR: Unknown PetscViewer type given: hdf5 > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!duVp7PZwdvHgymCgufX290k3tptJCHEo3vrV7dNt9zumYwqzDVsb1AG1HHargxq0LL-1JO6mjgiS7Vbykb1LEFOicQ$ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.20.4, unknown > [0]PETSC ERROR: /home/leon/vcs/underworld3/.venv-underworld3/bin/pytest on a arch-linux-c-debug named roci by leon Wed Mar 13 00:01:33 2024 > [0]PETSC ERROR: Configure options --with-fortran-bindings=0 --with-mpi-dir=/usr > [0]PETSC ERROR: #1 PetscViewerSetType() at /home/leon/vcs/petsc/src/sys/classes/viewer/interface/viewreg.c:535 > > Cheers, > Leon -------------- next part -------------- An HTML attachment was scrubbed... URL: From adigitoleo at posteo.net Tue Mar 12 22:54:40 2024 From: adigitoleo at posteo.net (adigitoleo (Leon)) Date: Wed, 13 Mar 2024 03:54:40 +0000 Subject: [petsc-users] petsc4py error code 86 from ViewerHDF5().create In-Reply-To: <75C2083E-F5A6-4DD0-9EE2-782809EF1F6D@petsc.dev> References: <75C2083E-F5A6-4DD0-9EE2-782809EF1F6D@petsc.dev> Message-ID: An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Mar 13 09:01:03 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 13 Mar 2024 10:01:03 -0400 Subject: [petsc-users] petsc4py error code 86 from ViewerHDF5().create In-Reply-To: References: <75C2083E-F5A6-4DD0-9EE2-782809EF1F6D@petsc.dev> Message-ID: <785D7501-F925-4FE3-B9A2-CA7A48CE9884@petsc.dev> An HTML attachment was scrubbed... URL: From lzou at anl.gov Wed Mar 13 09:36:17 2024 From: lzou at anl.gov (Zou, Ling) Date: Wed, 13 Mar 2024 14:36:17 +0000 Subject: [petsc-users] Possible bug associated with '-pc_factor_nonzeros_along_diagonal' option? Message-ID: Dear all, For the same code and same input file, using '-pc_factor_nonzeros_along_diagonal' causing a seg fault, while '-pc_factor_shift_type nonzero -pc_factor_shift_amount 1e-8' works fine. The seg fault happened after the very first residual vec norm was evaluated, somewhere before stepping into the second residual vec norm evaluation, i.e., NL Step = 0, fnorm = 5.22084E+01 Segmentation fault: 11 I tend to believe that it is not in my `snesFormFunction` because I used print to make sure to see that the ?sef fault? is after snesFormFunction was called in the first residual vec norm was shown, and before snesFormFunction was called for the next residual vec norm. I am behind ?moose,? so the debug version did not tell much info, showing the exact error message. It?s not an urgent issue for me, since the other option worked fine. Best, -Ling -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Mar 13 14:00:27 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 13 Mar 2024 15:00:27 -0400 Subject: [petsc-users] Fieldsplit, multigrid and DM interaction In-Reply-To: References: Message-ID: <197C526B-AAFD-4AF9-BF21-BBCA65EEF861@petsc.dev> Sorry no one responded to this email sooner. > On Mar 12, 2024, at 4:18?AM, Marco Seiz wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Hello, > > > I'd like to solve a Stokes-like equation with PETSc, i.e. > > > div( mu * symgrad(u) ) = -grad(p) - grad(mu*q) > > div(u) = q > > > with the spatially variable coefficients (mu, q) coming from another > application, which will advect and evolve fields via the velocity field > u from the Stokes solution, and throw back new (mu, q) to PETSc in a > loop, everything using finite difference. In preparation for this and > getting used to PETSc I wrote a simple inhomogeneous coefficient Poisson > solver, i.e. > > div (mu*grad(u) = -grad(mu*q), u unknown, > > based on src/ksp/ksp/tutorials/ex32.c which converges really nicely even > for mu contrasts of 10^10 using -ksp_type fgmres -pc_type mg. Since my > coefficients later on can't be calculated from coordinates, I put them > on a separate DM and attached it to the main DM via PetscObjectCompose > and used a DMCoarsenHookAdd to coarsen the DM the coefficients live on, > inspired by src/ts/tutorials/ex29.c . > > Adding another uncoupled DoF was simple enough and it converged > according to -ksp_converged_reason, but the solution started looking > very weird; roughly constant for each DoF, when it should be some > function going from roughly -value to +value due to symmetry. This > doesn't happen when I use a direct solver ( -ksp_type preonly -pc_type > lu -pc_factor_mat_solver_type umfpack ) and reading the archives, I > ought to be using -pc_type fieldsplit due to the block nature of the > matrix. I did that and the solution looked sensible again. Hmm, this sounds like the operator has the constant null space that is accumulating in the iterative method. The standard why to handle this is to use MatSetNullSpace() to provide the nullspace information so the iterative solver can remove it at each iteration. > > Now here comes the actual problem: Once I try adding multigrid > preconditioning to the split fields I get errors probably relating to > fieldsplit not "inheriting" (for lack of a better term) the associated > interpolations/added DMs and hooks on the fine DM. That is, when I use > the DMDA_Q0 interpolation, fieldsplit dies because it switches to > DMDA_Q1 and the size ratio is wrong ( Ratio between levels: (mx - 1)/(Mx > - 1) must be integer: mx 64 Mx 32 ). When I use DMDA_Q1, once the KSP > tries to setup the matrix on the coarsened problem the DM no longer has > the coefficient DMs which I previously had associated with it, i.e. > PetscCall(PetscObjectQuery((PetscObject)da, "coefficientdm", > (PetscObject *)&dm_coeff)); puts a NULL pointer in dm_coeff and PETSc > dies when trying to get a named vector from that, but it works nicely > without fieldsplit. > > Is there some way to get fieldsplit to automagically "inherit" those > added parts or do I need to manually modify the DMs the fieldsplit is > using? 
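[Editor's note: a minimal Fortran sketch of the MatSetNullSpace() suggestion above, not from the original exchange. The subroutine name is illustrative and A stands for whichever assembled operator carries the constant null space in the actual application.]

      subroutine attach_constant_nullspace(A,err_PETSc)
#include <petsc/finclude/petscmat.h>
        use petscmat
        implicit none
        Mat            :: A
        PetscErrorCode :: err_PETSc
        MatNullSpace   :: nullsp

        ! constant null space, no additional basis vectors
        call MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL_VEC,nullsp,err_PETSc)
        CHKERRQ(err_PETSc)
        call MatSetNullSpace(A,nullsp,err_PETSc)
        CHKERRQ(err_PETSc)
        call MatNullSpaceDestroy(nullsp,err_PETSc)
        CHKERRQ(err_PETSc)
      end subroutine attach_constant_nullspace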
I've been using KSPSetComputeOperators since it allows for > re-discretization without having to manage the levels myself, whereas > some more involved examples like src/dm/impls/stag/tutorials/ex4.c build > the matrices in advance when re-discretizing and set them with > KSPSetOperators, which would avoid the problem as well but also means > managing the levels. We don't have hooks to get your inner information automatically passed in from the outer DM but I think you can use PCFieldSplitGetSubKSP() after KSPSetUp() to get your two sub-KSPs, you then can set the "sub" DM to these etc to get them to "largely" behave as in your previous "uncoupled" code. Hopefully also use KSPSetOperators(). Barry > > > Any advice concerning solving my target Stokes-like equation is welcome > as well. I am coming from a explicit timestepping background so reading > up on saddle point problems and their efficient solution is all quite > new to me. > > > Best regards, > > Marco > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco at kit.ac.jp Wed Mar 13 20:35:33 2024 From: marco at kit.ac.jp (Marco Seiz) Date: Thu, 14 Mar 2024 10:35:33 +0900 Subject: [petsc-users] Fieldsplit, multigrid and DM interaction In-Reply-To: <197C526B-AAFD-4AF9-BF21-BBCA65EEF861@petsc.dev> References: <197C526B-AAFD-4AF9-BF21-BBCA65EEF861@petsc.dev> Message-ID: <86a6127a-05d1-4fc6-a0b0-1f34b9eddc51@kit.ac.jp> An HTML attachment was scrubbed... URL: From y.hu at mpie.de Thu Mar 14 16:43:37 2024 From: y.hu at mpie.de (Yi Hu) Date: Thu, 14 Mar 2024 22:43:37 +0100 Subject: [petsc-users] snes matrix-free jacobian fails with preconditioner shape mismatch In-Reply-To: References: <1fbb5696-a7d4-4826-85dc-539082a6566f@mpie.de> <5AA62E2D-402F-4FEB-B04A-B90336DBC6F5@petsc.dev> <850170af-808c-4a68-8aad-4825b9944260@mpie.de> Message-ID: <205a2b9b-1cd9-4a81-b9e3-f10def1e8e8b@mpie.de> Dear Mark, Thanks for your response. I found the bug. I used wrong M N when MatCreateShell(). Now it works in both uniprocessor and mpirun cases. Best, Yi On 3/11/24 13:10, Mark Adams wrote: > I think you misunderstand?"global" vector. The global solution is > never on one processor. A global vector in this case seems to have 81 > global values and 27 local values. > It looks like you create a "global" vector that has 81 local values > that should never be created other?than for debigging. > GlobalToLocal refers to a "local" vector with ghost cells, so it would > have > 27 values in this case and you use it for local operations only > (its communicator is PETSC_COMM_SELF). > Hope this helps, > Mark > > On Mon, Mar 11, 2024 at 4:48?AM Yi Hu wrote: > > Dear Barry, > > Thanks for your response. Now I am doing simple debugging for my > customized mat_mult_op of my shell jacobian matrix. As far as I > understand, because the input and output of shell jacobian are all > global vectors, I need to do the global to local mapping of my > input vector (dF) by myself. Before starting debugging the > mapping, I first try to verify the size match of input and output. > > Because the input and output of a mat_mult_op of my shell matrix > should have the same size. So I tried just equating them in my > customized mat_mult_op. Basically like this > > subroutineGK_op(Jac,dF_global,output_global,err_PETSc) > > DM ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? :: dm_local ! Yi: later for is,ie > > Vec ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?:: dF_global > > Vec ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?:: output_global > > PetscErrorCode ? ? ? ? ? ? ? ? ? ? ? 
:: err_PETSc > > real(pREAL), pointer,dimension(:,:,:,:) ::dF_scal, output_scal > > callSNESGetDM(SNES_mech,dm_local,err_PETSc) > > CHKERRQ(err_PETSc) > > output_global =dF_global > > end subroutineGK_op > > When I run with mpirun -np 3, it gives me similar error like > previous, ?Preconditioner number of local rows 27 does not equal > input vector size 81?, (I changed my problem size so the numbers > are different). > > Maybe simple equating input and output is not valid (due to > ownership of different portion of a global vector). Then it may > give me different error message. In fact my global dF has size > 9*3*3*3, when running on 3 processors, the local dF has size > 9*3*3*1 (I split my domain in z direction). The error message > seems to suggest I am using a local dF rather than a global dF. > And the output and input vector sizes seems to be different. Do I > miss something here? > > Best regards, > > Yi > > *From:*Barry Smith > *Sent:* Sunday, March 10, 2024 6:45 PM > *To:* Yi Hu > *Cc:* Mark Adams ; petsc-users > > *Subject:* Re: [petsc-users] snes matrix-free jacobian fails with > preconditioner shape mismatch > > > > On Mar 10, 2024, at 10:16?AM, Yi Hu wrote: > > This Message Is From an External Sender > > This message came from outside your organization. > > Dear Mark, > > Thanks for your reply. I see this mismatch. In fact my global > DoF is 324. It seems like I always get the local size = global > Dof / np^2, np is my processor number. By the way, I used > DMDASNESsetFunctionLocal() to set my form function. Is it > eligible to mix DMDASNESsetFunctionLocal() and a native > SNESSetJacobian()? > > Yes > > > > Best, > > Yi > > On 3/10/24 13:55, Mark Adams wrote: > > It looks like your input vector is the global vector, size > 162, and the local matrix size is 81. > > Mark > > [1]PETSC ERROR: Preconditioner number of local rows 81 > does not equal > > input vector size 162 > > On Sun, Mar 10, 2024 at 7:21?AM Yi Hu wrote: > > *This Message Is From an External Sender* > > This message came from outside your organization. > > Dear petsc team, > > I implemented a matrix-free jacobian, and it can run > sequentially. But > > running parallel I got the pc error like this (running > with mpirun -np > > 2, only error from rank1 is presented here) > > [1]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [1]PETSC ERROR: Nonconforming object sizes > > [1]PETSC ERROR: Preconditioner number of local rows 81 > does not equal > > input vector size 162 > > [1]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0PlyUrtdfCA$ > > for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.17.3, Jun 29, 2022 > > [1]PETSC ERROR: > /home/yi/workspace/DAMASK_yi/bin/DAMASK_grid on a > > arch-linux-c-opt named carbon-x1 by yi Sun Mar 10 > 12:01:46 2024 > > [1]PETSC ERROR: Configure options --download-fftw > --download-hdf5 > > --with-hdf5-fortran-bindings --download-fblaslapack > --download-chaco > > --download-hypre --download-metis --download-mumps > --download-parmetis > > --download-scalapack --download-suitesparse > --download-superlu > > --download-superlu_dist --download-triangle > --download-zlib > > --download-cmake --with-cxx-dialect=C++11 --with-c2html=0 > > --with-debugging=0 --with-ssl=0 --with-x=0 > COPTFLAGS=-O3 CXXOPTFLAGS=-O3 > > FOPTFLAGS=-O3 > > [1]PETSC ERROR: #1 PCApply() at > > /home/yi/App/petsc-3.17.3/src/ksp/pc/interface/precon.c:424 > > [1]PETSC ERROR: #2 KSP_PCApply() at > > /home/yi/App/petsc-3.17.3/include/petsc/private/kspimpl.h:376 > > [1]PETSC ERROR: #3 KSPInitialResidual() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itres.c:64 > > [1]PETSC ERROR: #4 KSPSolve_GMRES() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/impls/gmres/gmres.c:242 > > [1]PETSC ERROR: #5 KSPSolve_Private() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:902 > > [1]PETSC ERROR: #6 KSPSolve() at > > /home/yi/App/petsc-3.17.3/src/ksp/ksp/interface/itfunc.c:1078 > > [1]PETSC ERROR: #7 SNESSolve_NEWTONLS() at > > /home/yi/App/petsc-3.17.3/src/snes/impls/ls/ls.c:222 > > [1]PETSC ERROR: #8 SNESSolve() at > > /home/yi/App/petsc-3.17.3/src/snes/interface/snes.c:4756 > > [1]PETSC ERROR: #9 User provided function() at User file:0 > > However, from snes matrix-free documentation > > (https://urldefense.us/v3/__https://petsc.org/release/manual/snes/*matrix-free-methods__;Iw!!G_uCfscf7eWS!ahmistzr4wD3TJ0OvI0JWxB9aVSIbP78Jcs2X_6KMb4LdoR8drLB_DkHvaguhrca22RgFer0Ply6ZbywOw$ > ), > it is said > > matrix-free is used with pcnone. So I assume it would > not apply > > preconditioner, but it did use preconditioning > probably the same as my > > matrix-free shell matrix. Here is how i initialize my > shell matrix and > > the corresponding customized multiplication. > > ? call > MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,& > > int(9*product(cells(1:2))*cells3,pPETSCINT),& > > int(9*product(cells(1:2))*cells3,pPETSCINT),& > > ????????????????????? F_PETSc,Jac_PETSc,err_PETSc) > > ? call > MatShellSetOperation(Jac_PETSc,MATOP_MULT,GK_op,err_PETSc) > > ? call SNESSetDM(SNES_mech,DM_mech,err_PETSc) > > ? call > > SNESSetJacobian(SNES_mech,Jac_PETSc,Jac_PETSc,PETSC_NULL_FUNCTION,0,err_PETSc) > > ? call SNESGetKSP(SNES_mech,ksp,err_PETSc) > > ? call PCSetType(pc,PCNONE,err_PETSc) > > And my GK_op is like > > subroutine GK_op(Jac,dF_global,output_local,err_PETSc) > > ? DM?????????????????????????????????? :: dm_local > > ? Vec????????????????????????????????? :: dF_global, > dF_local, output_local > > ? Mat????????????????????????????????? :: Jac > > ? PetscErrorCode?????????????????????? :: err_PETSc > > ? real(pREAL), pointer,dimension(:,:,:,:) :: dF_scal, > output_scal > > ? real(pREAL), dimension(3,3,cells(1),cells(2),cells3) > :: & > > ??? dF > > ? real(pREAL), dimension(3,3,cells(1),cells(2),cells3) > :: & > > ??? output > > ? call SNESGetDM(SNES_mech,dm_local,err_PETSc) > > ? call DMGetLocalVector(dm_local,dF_local,err_PETSc) > > ? call > > DMGlobalToLocalBegin(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > > ? call > > DMGlobalToLocalEnd(dm_local,dF_global,INSERT_VALUES,dF_local,err_PETSc) > > ? 
call > DMDAVecGetArrayReadF90(dm_local,dF_local,dF_scal,err_PETSc) > > ? dF = reshape(dF_scal, [3,3,cells(1),cells(2),cells3]) > > ....... > > ? call > DMDAVecRestoreArrayF90(dm_local,output_local,output_scal,err_PETSc) > > ? CHKERRQ(err_PETSc) > > ? call > DMDAVecRestoreArrayF90(dm_local,dF_local,dF_scal,err_PETSc) > > ? CHKERRQ(err_PETSc) > > end subroutine GK_op > > I checked my cells3, it corresponds to my local size, > and it seems the > > local size of dF_local is ok. > > I am a bit lost here to find the reason for the > preconditioner bug. > > Could you help me on this? Thanks. > > Best regards, > > Yi > > ------------------------------------------------- > > Stay up to date and follow us on LinkedIn, Twitter and > YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > > Max-Planck-Stra?e 1 > > D-40237 D?sseldorf > > Handelsregister B 2533 > > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > > Prof. Dr. Gerhard Dehm > > Prof. Dr. J?rg Neugebauer > > Prof. Dr. Dierk Raabe > > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > > Steuernummer: 105 5891 1000 > > Please consider that invitations and e-mails of our > institute are > > only valid if they end with ?@mpie.de > . > > > If you are not sure of the validity please contact > rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu > Veranstaltungen und E-Mails > > aus unserem Haus nur mit der Endung ?@mpie.de > > g?ltig sind. > > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > > ------------------------------------------------- > > > > ------------------------------------------------------------------------ > > ------------------------------------------------- > Stay?up?to?date?and?follow?us?on?LinkedIn,?Twitter?and?YouTube. > > Max-Planck-Institut?f?r?Eisenforschung?GmbH > Max-Planck-Stra?e?1 > D-40237?D?sseldorf > > Handelsregister?B?2533 > Amtsgericht?D?sseldorf > > Gesch?ftsf?hrung > Prof.?Dr.?Gerhard?Dehm > Prof.?Dr.?J?rg?Neugebauer > Prof.?Dr.?Dierk?Raabe > Dr.?Kai?de?Weldige > > Ust.-Id.-Nr.:?DE?11?93?58?514 > Steuernummer:?105?5891?1000 > > > Please?consider?that?invitations?and?e-mails?of?our?institute?are > only?valid?if?they?end?with??@mpie.de . > If?you?are?not?sure?of?the?validity?please?contact rco at mpie.de > > > Bitte?beachten?Sie,?dass?Einladungen?zu?Veranstaltungen?und?E-Mails > aus?unserem?Haus?nur?mit?der?Endung??@mpie.de > ?g?ltig?sind. > In?Zweifelsf?llen?wenden?Sie?sich?bitte?an rco at mpie.de > > ------------------------------------------------- > > > > ------------------------------------------------------------------------ > ------------------------------------------------- > Stay?up?to?date?and?follow?us?on?LinkedIn,?Twitter?and?YouTube. > > Max-Planck-Institut?f?r?Eisenforschung?GmbH > Max-Planck-Stra?e?1 > D-40237?D?sseldorf > > Handelsregister?B?2533 > Amtsgericht?D?sseldorf > > Gesch?ftsf?hrung > Prof.?Dr.?Gerhard?Dehm > Prof.?Dr.?J?rg?Neugebauer > Prof.?Dr.?Dierk?Raabe > Dr.?Kai?de?Weldige > > Ust.-Id.-Nr.:?DE?11?93?58?514 > Steuernummer:?105?5891?1000 > > > Please?consider?that?invitations?and?e-mails?of?our?institute?are > only?valid?if?they?end?with??@mpie.de . > If?you?are?not?sure?of?the?validity?please?contact rco at mpie.de > > Bitte?beachten?Sie,?dass?Einladungen?zu?Veranstaltungen?und?E-Mails > aus?unserem?Haus?nur?mit?der?Endung??@mpie.de > ?g?ltig?sind. 
> In?Zweifelsf?llen?wenden?Sie?sich?bitte?an rco at mpie.de > ------------------------------------------------- > ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bramkamp at nsc.liu.se Fri Mar 15 08:53:54 2024 From: bramkamp at nsc.liu.se (Frank Bramkamp) Date: Fri, 15 Mar 2024 14:53:54 +0100 Subject: [petsc-users] MATSETVALUES: Fortran problem Message-ID: An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Mar 15 09:07:39 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 15 Mar 2024 10:07:39 -0400 Subject: [petsc-users] MATSETVALUES: Fortran problem In-Reply-To: References: Message-ID: > On Mar 15, 2024, at 9:53?AM, Frank Bramkamp wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Dear PETSc Team, > > I am using the latest petsc version 3.20.5. > > > I would like to create a matrix using > MatCreateSeqAIJ > > To insert values, I use MatSetValues. > It seems that the Fortran interface/stubs are missing for MatsetValues, as the linker does not find any subroutine with that name. > MatSetValueLocal seems to be fine. Please send the exact error message (cut and paste), there are definitely Fortran stubs for this function but it could be you exact parameter input does not have a stub yet. Barry > > > Typically I am using a blocked matrix format (BAIJ), which works fine in fortran. > Soon we want to try PETSC on GPUs, using the format MATAIJCUSPARSE, since there seems not to be a blocked format available in PETSC for GPUs so far. > Therefore I first want to try the pointwise format MatCreateSeqAIJ format on a CPU, before using the GPU format. > > I think that CUDA also supports a block format now ?! Maybe that would be also useful to have one day. > > > Greetings, Frank Bramkamp > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 15 09:25:01 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Mar 2024 10:25:01 -0400 Subject: [petsc-users] MATSETVALUES: Fortran problem In-Reply-To: References: Message-ID: On Fri, Mar 15, 2024 at 9:55?AM Frank Bramkamp wrote: > Dear PETSc Team, I am using the latest petsc version 3. 20. 5. I would > like to create a matrix using MatCreateSeqAIJ To insert values, I use > MatSetValues. It seems that the Fortran interface/stubs are missing for > MatsetValues, as the linker does > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > > Dear PETSc Team, > > I am using the latest petsc version 3.20.5. 
> > > I would like to create a matrix using > MatCreateSeqAIJ > > To insert values, I use MatSetValues. > It seems that the Fortran interface/stubs are missing for MatsetValues, as the linker does not find any subroutine with that name. > MatSetValueLocal seems to be fine. > > Here is a Fortran example calling MatSetValues(): https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex1f.F90?ref_type=heads__;!!G_uCfscf7eWS!ag7gfsg4qE_beGvp2_VPt5PyN1bZcFr8xkVlbukEpUnqYHx_awYDepWWZfT-7rtVF5-lHC2GXj1ETtSIaOLc$ > Typically I am using a blocked matrix format (BAIJ), which works fine in fortran. > Soon we want to try PETSC on GPUs, using the format MATAIJCUSPARSE, since there seems not to be a blocked format available in PETSC for GPUs so far. > > You can use the blocked input API, like MatSetValuesBlocked(), with all the storage formats. There is no explicit blocking in the MATAIJCUSPARSE format because Nvidia handles the optimization differently. Thanks, Matt Therefore I first want to try the pointwise format MatCreateSeqAIJ format on a CPU, before using the GPU format. > > I think that CUDA also supports a block format now ?! Maybe that would be also useful to have one day. > > > Greetings, Frank Bramkamp > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ag7gfsg4qE_beGvp2_VPt5PyN1bZcFr8xkVlbukEpUnqYHx_awYDepWWZfT-7rtVF5-lHC2GXj1ETryTqsT3$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Fri Mar 15 09:44:44 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Fri, 15 Mar 2024 22:44:44 +0800 Subject: [petsc-users] Help Needed Debugging Installation Issue for PETSc with SLEPc Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: logs.tar.gz Type: application/x-gzip Size: 2646456 bytes Desc: not available URL: From pierre at joliv.et Fri Mar 15 09:50:44 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 15 Mar 2024 15:50:44 +0100 Subject: [petsc-users] Help Needed Debugging Installation Issue for PETSc with SLEPc In-Reply-To: References: Message-ID: <2A66E419-B247-4283-9933-BCF09A0707CA@joliv.et> This was fixed 5 days ago in https://urldefense.us/v3/__https://gitlab.com/slepc/slepc/-/merge_requests/638__;!!G_uCfscf7eWS!Z6nHLo3qh8bAe4zTIkJHhBcLc2WyVfPb47nBF38kwu8pyoNFz11wVdiL2ytUg66oHoVPkjBafopf_Gag8j6dXQ$ , so you need to use an up-to-date release branch of SLEPc. Thanks, Pierre > On 15 Mar 2024, at 3:44?PM, Zongze Yang wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Hi, > > I am currently facing an issue while attempting to install PETSc with SLEPc. Despite not encountering any errors in the log generated by the 'make' command, I am receiving an error message stating "Error during compile". > > I would greatly appreciate it if someone could provide me with some guidance on debugging this issue. > > I have attached the configure logs and make logs for both PETSc and SLEPc for your reference. 
> > Below is an excerpt from the make.log file of SLEPc: > ``` > CLINKER default/lib/libslepc.3.020.1.dylib > ld: warning: -commons use_dylibs is no longer supported, using error treatment instead > ld: warning: -commons use_dylibs is no longer supported, using error treatment instead > ld: warning: -commons use_dylibs is no longer supported, using error treatment instead > ld: warning: duplicate -rpath '/Users/zzyang/opt/firedrake/firedrake-real-int32-debug/src/petsc/default/lib' ignored > ld: warning: dylib (/opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/current/libgfortran.dylib) was built for newer macOS version (14.0) than being linked (12.0) > ld: warning: dylib (/opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/current/libquadmath.dylib) was built for newer macOS version (14.0) than being linked (12.0) > DSYMUTIL default/lib/libslepc.3.020.1.dylib > gmake[6]: Leaving directory '/Users/zzyang/opt/firedrake/firedrake-real-int32-debug/src/petsc/default/externalpackages/git.slepc' > gmake[5]: Leaving directory '/Users/zzyang/opt/firedrake/firedrake-real-int32-debug/src/petsc/default/externalpackages/git.slepc' > *******************************ERROR************************************ > Error during compile, check default/lib/slepc/conf/make.log > Send all contents of ./default/lib/slepc/conf to slepc-maint at upv.es > ************************************************************************ > Finishing make run at ?, 15 3 2024 21:04:17 +0800 > ``` > > Thank you very much for your attention and support. > > Best wishes, > Zongze > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Fri Mar 15 09:56:09 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Fri, 15 Mar 2024 22:56:09 +0800 Subject: [petsc-users] Help Needed Debugging Installation Issue for PETSc with SLEPc In-Reply-To: <2A66E419-B247-4283-9933-BCF09A0707CA@joliv.et> References: <2A66E419-B247-4283-9933-BCF09A0707CA@joliv.et> Message-ID: Thanks very much! Best wishes, Zongze > On 15 Mar 2024, at 22:50, Pierre Jolivet wrote: > > This was fixed 5 days ago in https://urldefense.us/v3/__https://gitlab.com/slepc/slepc/-/merge_requests/638__;!!G_uCfscf7eWS!fPbzaiH98Qdihq74E8kcX8JQE_EGk_PhMqYnheGPjwJD2l8AhgXg3ZrEkFZmV0lfG8vVlKawhAaQq4eSMRZVffeF$ , so you need to use an up-to-date release branch of SLEPc. > > Thanks, > Pierre > >> On 15 Mar 2024, at 3:44?PM, Zongze Yang wrote: >> >> This Message Is From an External Sender >> This message came from outside your organization. >> Hi, >> >> I am currently facing an issue while attempting to install PETSc with SLEPc. Despite not encountering any errors in the log generated by the 'make' command, I am receiving an error message stating "Error during compile". >> >> I would greatly appreciate it if someone could provide me with some guidance on debugging this issue. >> >> I have attached the configure logs and make logs for both PETSc and SLEPc for your reference. 
>> >> Below is an excerpt from the make.log file of SLEPc: >> ``` >> CLINKER default/lib/libslepc.3.020.1.dylib >> ld: warning: -commons use_dylibs is no longer supported, using error treatment instead >> ld: warning: -commons use_dylibs is no longer supported, using error treatment instead >> ld: warning: -commons use_dylibs is no longer supported, using error treatment instead >> ld: warning: duplicate -rpath '/Users/zzyang/opt/firedrake/firedrake-real-int32-debug/src/petsc/default/lib' ignored >> ld: warning: dylib (/opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/current/libgfortran.dylib) was built for newer macOS version (14.0) than being linked (12.0) >> ld: warning: dylib (/opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/current/libquadmath.dylib) was built for newer macOS version (14.0) than being linked (12.0) >> DSYMUTIL default/lib/libslepc.3.020.1.dylib >> gmake[6]: Leaving directory '/Users/zzyang/opt/firedrake/firedrake-real-int32-debug/src/petsc/default/externalpackages/git.slepc' >> gmake[5]: Leaving directory '/Users/zzyang/opt/firedrake/firedrake-real-int32-debug/src/petsc/default/externalpackages/git.slepc' >> *******************************ERROR************************************ >> Error during compile, check default/lib/slepc/conf/make.log >> Send all contents of ./default/lib/slepc/conf to slepc-maint at upv.es >> ************************************************************************ >> Finishing make run at ?, 15 3 2024 21:04:17 +0800 >> ``` >> >> Thank you very much for your attention and support. >> >> Best wishes, >> Zongze >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun Mar 17 05:29:38 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 17 Mar 2024 18:29:38 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS Message-ID: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.tar.gz Type: application/x-gzip Size: 868937 bytes Desc: not available URL: -------------- next part -------------- From pierre at joliv.et Sun Mar 17 05:48:13 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Sun, 17 Mar 2024 11:48:13 +0100 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> Message-ID: <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!ZkCTTWYisvuRPh4YfLXoXTv2cY0NTXiSQhpIXFilhL3cIcymTpMi-q4OnEYp5mh85b9wUGFXUa55vdINNmkvfQ$ main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!ZkCTTWYisvuRPh4YfLXoXTv2cY0NTXiSQhpIXFilhL3cIcymTpMi-q4OnEYp5mh85b9wUGFXUa55vdKnE_DalQ$ , so the alternative is to revert to the commit prior. It should work either way. Thanks, Pierre > On 17 Mar 2024, at 11:31?AM, Zongze Yang wrote: > > ? > This Message Is From an External Sender > This message came from outside your organization. 
> Hi, PETSc Team, > > I am trying to install petsc with the following configuration > ``` > ./configure \ > --download-bison \ > --download-mpich \ > --download-mpich-configure-arguments=--disable-opencl \ > --download-hwloc \ > --download-hwloc-configure-arguments=--disable-opencl \ > --download-openblas \ > --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ > --with-shared-libraries=1 \ > --with-fortran-bindings=0 \ > --with-zlib \ > LDFLAGS=-Wl,-ld_classic > ``` > > The log shows that > ``` > Exhausted all shared linker guesses. Could not determine how to create a shared library! > ``` > > I recently updated the system and Xcode, as well as homebrew. > > The configure.log is attached. > > Thanks for your attention to this matter. > > Best wishes, > Zongze > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun Mar 17 07:04:21 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 17 Mar 2024 20:04:21 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> Message-ID: <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> Thank you for providing the instructions. I tried the first option. Now, the configuration error is related to OpenBLAS. Adding `--CFLAGS=-Wno-int-conversion` to the configure command resolves this. Should this be reported to OpenBLAS, or does configure in petsc need to be fixed? The configure.log is attached. The errors are shown below: ``` src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); ^~~~ & src/../inc/relapack.h:74:216: note: passing argument to parameter here void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); ^ src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); ^~~~ & src/../inc/relapack.h:75:221: note: passing argument to parameter here void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); ^ src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); ^~~~ & src/../inc/relapack.h:76:216: note: passing argument to parameter here void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); ^ src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing
'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); ^~~~ & src/../inc/relapack.h:77:221: note: passing argument to parameter here void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); ^ 4 errors generated. ``` Best wishes, Zongze ? > On 17 Mar 2024, at 18:48, Pierre Jolivet wrote: > > You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bUnys903wnuB3NvLnoQEH8iD_BqwKe6SWm2uXtRSRIFro2s8QswVwytOJP_iJvq_VnGlbXXsz234SXVcQ8zhJ-WC$ > main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bUnys903wnuB3NvLnoQEH8iD_BqwKe6SWm2uXtRSRIFro2s8QswVwytOJP_iJvq_VnGlbXXsz234SXVcQ2Q2MLSV$ , so the alternative is to revert to the commit prior. > It should work either way. > > Thanks, > Pierre > >> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >> >> ? >> This Message Is From an External Sender >> This message came from outside your organization. >> Hi, PETSc Team, >> >> I am trying to install petsc with the following configuration >> ``` >> ./configure \ >> --download-bison \ >> --download-mpich \ >> --download-mpich-configure-arguments=--disable-opencl \ >> --download-hwloc \ >> --download-hwloc-configure-arguments=--disable-opencl \ >> --download-openblas \ >> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >> --with-shared-libraries=1 \ >> --with-fortran-bindings=0 \ >> --with-zlib \ >> LDFLAGS=-Wl,-ld_classic >> ``` >> >> The log shows that >> ``` >> Exhausted all shared linker guesses. Could not determine how to create a shared library! >> ``` >> >> I recently updated the system and Xcode, as well as homebrew. >> >> The configure.log is attached. >> >> Thanks for your attention to this matter. >> >> Best wishes, >> Zongze >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.tar.gz Type: application/x-gzip Size: 1196755 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sun Mar 17 07:58:58 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Sun, 17 Mar 2024 13:58:58 +0100 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> Message-ID: <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> > On 17 Mar 2024, at 1:04?PM, Zongze Yang wrote: > > Thank you for providing the instructions. I try the first option. > > Now, the error of the configuration is related to OpenBLAS. > Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? 
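Putting the two workarounds mentioned so far together, a minimal sketch of what the adjusted configure invocation might look like. The remaining options are unchanged from the original command, and folding TARGET=GENERIC into the existing OpenBLAS make-options string is an assumption about how the two settings combine:
```
./configure \
  --download-bison \
  --download-mpich \
  --download-mpich-configure-arguments=--disable-opencl \
  --download-hwloc \
  --download-hwloc-configure-arguments=--disable-opencl \
  --download-openblas \
  --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0 TARGET=GENERIC'" \
  --with-shared-libraries=1 \
  --with-fortran-bindings=0 \
  --with-zlib \
  --CFLAGS=-Wno-int-conversion \
  LDFLAGS=-Wl,-ld_classic
```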
I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). Thanks, Pierre > > The configure.log is attached. The errors are show below: > ``` > src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > ^~~~ > & > src/../inc/relapack.h:74:216: note: passing argument to parameter here > void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > ^ > src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > ^~~~ > & > src/../inc/relapack.h:75:221: note: passing argument to parameter here > void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > ^ > src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > ^~~~ > & > src/../inc/relapack.h:76:216: note: passing argument to parameter here > void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > ^ > src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > ^~~~ > & > src/../inc/relapack.h:77:221: note: passing argument to parameter here > void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > ^ > 4 errors generated. > ``` > > Best wishes, > Zongze > > > >> On 17 Mar 2024, at 18:48, Pierre Jolivet wrote: >> >> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!ZvwjMAU-lcsnHOD1DYr8ZDVvdFJXa8hvGZBO4pBGgAQUue7mdUWP4lTTAroHEIUV3yEADf1DJ7z-etn64PkQMw$ >> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!ZvwjMAU-lcsnHOD1DYr8ZDVvdFJXa8hvGZBO4pBGgAQUue7mdUWP4lTTAroHEIUV3yEADf1DJ7z-etlzwFFxow$ , so the alternative is to revert to the commit prior. >> It should work either way. >> >> Thanks, >> Pierre >> >>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>> >>> ? 
>>> This Message Is From an External Sender >>> This message came from outside your organization. >>> Hi, PETSc Team, >>> >>> I am trying to install petsc with the following configuration >>> ``` >>> ./configure \ >>> --download-bison \ >>> --download-mpich \ >>> --download-mpich-configure-arguments=--disable-opencl \ >>> --download-hwloc \ >>> --download-hwloc-configure-arguments=--disable-opencl \ >>> --download-openblas \ >>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>> --with-shared-libraries=1 \ >>> --with-fortran-bindings=0 \ >>> --with-zlib \ >>> LDFLAGS=-Wl,-ld_classic >>> ``` >>> >>> The log shows that >>> ``` >>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>> ``` >>> >>> I recently updated the system and Xcode, as well as homebrew. >>> >>> The configure.log is attached. >>> >>> Thanks for your attention to this matter. >>> >>> Best wishes, >>> Zongze >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun Mar 17 08:58:01 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 17 Mar 2024 21:58:01 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> Message-ID: <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. Best wishes, Zongze > On 17 Mar 2024, at 20:58, Pierre Jolivet wrote: > > > >> On 17 Mar 2024, at 1:04?PM, Zongze Yang wrote: >> >> Thank you for providing the instructions. I try the first option. >> >> Now, the error of the configuration is related to OpenBLAS. >> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? > > I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? > I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). > > Thanks, > Pierre > >> >> The configure.log is attached. 
The errors are show below: >> ``` >> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >> ^~~~ >> & >> src/../inc/relapack.h:74:216: note: passing argument to parameter here >> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >> ^ >> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >> ^~~~ >> & >> src/../inc/relapack.h:75:221: note: passing argument to parameter here >> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >> ^ >> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >> ^~~~ >> & >> src/../inc/relapack.h:76:216: note: passing argument to parameter here >> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >> ^ >> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >> ^~~~ >> & >> src/../inc/relapack.h:77:221: note: passing argument to parameter here >> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >> ^ >> 4 errors generated. >> ``` >> >> Best wishes, >> Zongze >> >> >> >>> On 17 Mar 2024, at 18:48, Pierre Jolivet wrote: >>> >>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!eCQRfbol7FDQiO0o78iDit2saij_ydIUtCfRQnsQAt-h_YcXr2Yi2BFnFnqHZp0FO3Lhpyr2RKdHZ-T-OF94HpwQ$ >>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!eCQRfbol7FDQiO0o78iDit2saij_ydIUtCfRQnsQAt-h_YcXr2Yi2BFnFnqHZp0FO3Lhpyr2RKdHZ-T-OIhlJwLx$ , so the alternative is to revert to the commit prior. >>> It should work either way. >>> >>> Thanks, >>> Pierre >>> >>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>> >>>> ? >>>> This Message Is From an External Sender >>>> This message came from outside your organization. 
>>>> Hi, PETSc Team, >>>> >>>> I am trying to install petsc with the following configuration >>>> ``` >>>> ./configure \ >>>> --download-bison \ >>>> --download-mpich \ >>>> --download-mpich-configure-arguments=--disable-opencl \ >>>> --download-hwloc \ >>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>> --download-openblas \ >>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>> --with-shared-libraries=1 \ >>>> --with-fortran-bindings=0 \ >>>> --with-zlib \ >>>> LDFLAGS=-Wl,-ld_classic >>>> ``` >>>> >>>> The log shows that >>>> ``` >>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>> ``` >>>> >>>> I recently updated the system and Xcode, as well as homebrew. >>>> >>>> The configure.log is attached. >>>> >>>> Thanks for your attention to this matter. >>>> >>>> Best wishes, >>>> Zongze >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Mar 17 09:04:16 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 17 Mar 2024 10:04:16 -0400 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> Message-ID: <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. > On Mar 17, 2024, at 9:58?AM, Zongze Yang wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. > > Best wishes, > Zongze > >> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >> >> >> >>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>> >>> Thank you for providing the instructions. I try the first option. >>> >>> Now, the error of the configuration is related to OpenBLAS. >>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >> >> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >> >> Thanks, >> Pierre >> >>> >>> The configure.log is attached. 
The errors are show below: >>> ``` >>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>> ^~~~ >>> & >>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>> ^ >>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>> ^~~~ >>> & >>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>> ^ >>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>> ^~~~ >>> & >>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>> ^ >>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>> ^~~~ >>> & >>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>> ^ >>> 4 errors generated. >>> ``` >>> >>> Best wishes, >>> Zongze >>> >>> >>> >>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>> >>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!eOGIcMBe3EV8va7MfOwaPaSEEZWB0R4RNvzMgW8PRyySg08UkUdgsHoooFig3Oai-akPT9jYBu9bh5e2Mgqi_ao$ >>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!eOGIcMBe3EV8va7MfOwaPaSEEZWB0R4RNvzMgW8PRyySg08UkUdgsHoooFig3Oai-akPT9jYBu9bh5e2sXmUpjA$ , so the alternative is to revert to the commit prior. >>>> It should work either way. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>> >>>>> ? >>>>> This Message Is From an External Sender >>>>> This message came from outside your organization. 
>>>>> Hi, PETSc Team, >>>>> >>>>> I am trying to install petsc with the following configuration >>>>> ``` >>>>> ./configure \ >>>>> --download-bison \ >>>>> --download-mpich \ >>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>> --download-hwloc \ >>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>> --download-openblas \ >>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>> --with-shared-libraries=1 \ >>>>> --with-fortran-bindings=0 \ >>>>> --with-zlib \ >>>>> LDFLAGS=-Wl,-ld_classic >>>>> ``` >>>>> >>>>> The log shows that >>>>> ``` >>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>> ``` >>>>> >>>>> I recently updated the system and Xcode, as well as homebrew. >>>>> >>>>> The configure.log is attached. >>>>> >>>>> Thanks for your attention to this matter. >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun Mar 17 09:06:48 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 17 Mar 2024 22:06:48 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> Message-ID: Understood. Thank you for your advice. Best wishes, Zongze > On 17 Mar 2024, at 22:04, Barry Smith wrote: > > > I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. > > >> On Mar 17, 2024, at 9:58?AM, Zongze Yang wrote: >> >> This Message Is From an External Sender >> This message came from outside your organization. >> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >> >> Best wishes, >> Zongze >> >>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>> >>> >>> >>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>> >>>> Thank you for providing the instructions. I try the first option. >>>> >>>> Now, the error of the configuration is related to OpenBLAS. >>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >>> >>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>> >>> Thanks, >>> Pierre >>> >>>> >>>> The configure.log is attached. 
The errors are show below: >>>> ``` >>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>> ^~~~ >>>> & >>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>> ^ >>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>> ^~~~ >>>> & >>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>> ^ >>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>> ^~~~ >>>> & >>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>> ^ >>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>> ^~~~ >>>> & >>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>> ^ >>>> 4 errors generated. >>>> ``` >>>> >>>> Best wishes, >>>> Zongze >>>> >>>> >>>> >>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>> >>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bg2LIKBxOm1exSNScy6-lmAt_UvmhaKaLt_8vup8lDi5cObA-l03LzqWnOPP66bCqEr3RCS9zwx63Bsz7PbEjhhM$ >>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bg2LIKBxOm1exSNScy6-lmAt_UvmhaKaLt_8vup8lDi5cObA-l03LzqWnOPP66bCqEr3RCS9zwx63Bsz7M0XuMO3$ , so the alternative is to revert to the commit prior. >>>>> It should work either way. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>> >>>>>> ? >>>>>> This Message Is From an External Sender >>>>>> This message came from outside your organization. 
>>>>>> Hi, PETSc Team, >>>>>> >>>>>> I am trying to install petsc with the following configuration >>>>>> ``` >>>>>> ./configure \ >>>>>> --download-bison \ >>>>>> --download-mpich \ >>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>> --download-hwloc \ >>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>> --download-openblas \ >>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>> --with-shared-libraries=1 \ >>>>>> --with-fortran-bindings=0 \ >>>>>> --with-zlib \ >>>>>> LDFLAGS=-Wl,-ld_classic >>>>>> ``` >>>>>> >>>>>> The log shows that >>>>>> ``` >>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>> ``` >>>>>> >>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>> >>>>>> The configure.log is attached. >>>>>> >>>>>> Thanks for your attention to this matter. >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Sun Mar 17 09:23:23 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Sun, 17 Mar 2024 15:23:23 +0100 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> Message-ID: Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? Take Barry?s advice. Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. Thanks, Pierre PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. > On 17 Mar 2024, at 3:06?PM, Zongze Yang wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Understood. Thank you for your advice. > > Best wishes, > Zongze > >> On 17 Mar 2024, at 22:04, Barry Smith > wrote: >> >> >> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. >> >> >>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: >>> >>> This Message Is From an External Sender >>> This message came from outside your organization. >>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >>> >>> Best wishes, >>> Zongze >>> >>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>>> >>>> >>>> >>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>>> >>>>> Thank you for providing the instructions. I try the first option. >>>>> >>>>> Now, the error of the configuration is related to OpenBLAS. >>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? 
Or need to fix the configure in petsc? >>>> >>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> >>>>> The configure.log is attached. The errors are show below: >>>>> ``` >>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>> ^~~~ >>>>> & >>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>> ^ >>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>> ^~~~ >>>>> & >>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>> ^ >>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>> ^~~~ >>>>> & >>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>> ^ >>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>> ^~~~ >>>>> & >>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>> ^ >>>>> 4 errors generated. 
>>>>> ``` >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>>> >>>>> >>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>>> >>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ >>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. >>>>>> It should work either way. >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>>> >>>>>>> ? >>>>>>> This Message Is From an External Sender >>>>>>> This message came from outside your organization. >>>>>>> Hi, PETSc Team, >>>>>>> >>>>>>> I am trying to install petsc with the following configuration >>>>>>> ``` >>>>>>> ./configure \ >>>>>>> --download-bison \ >>>>>>> --download-mpich \ >>>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>>> --download-hwloc \ >>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>>> --download-openblas \ >>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>>> --with-shared-libraries=1 \ >>>>>>> --with-fortran-bindings=0 \ >>>>>>> --with-zlib \ >>>>>>> LDFLAGS=-Wl,-ld_classic >>>>>>> ``` >>>>>>> >>>>>>> The log shows that >>>>>>> ``` >>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>>> ``` >>>>>>> >>>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>>> >>>>>>> The configure.log is attached. >>>>>>> >>>>>>> Thanks for your attention to this matter. >>>>>>> >>>>>>> Best wishes, >>>>>>> Zongze >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangzongze at gmail.com Sun Mar 17 09:50:07 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Sun, 17 Mar 2024 22:50:07 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> Message-ID: <2D1AC7F9-E491-4B12-9F9F-4F848CE9243F@gmail.com> After removing OpenBLAS, everything is working fine. Thanks! Best wishes, Zongze > On 17 Mar 2024, at 22:23, Pierre Jolivet wrote: > > Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? > Take Barry?s advice. > Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!ayl2seevZWjJZO9OMKEHMo_bZDQOX3aAF1By7x_1x9rbjOmrQ2DwsX1KitiUFsStTf9_ThHAAHkgb1dIJmTSF9Qa$ , so that?s another good argument in favor of -framework Accelerate. > > Thanks, > Pierre > > PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!ayl2seevZWjJZO9OMKEHMo_bZDQOX3aAF1By7x_1x9rbjOmrQ2DwsX1KitiUFsStTf9_ThHAAHkgb1dIJrPiSsuB$ ? I didn?t even know they existed. 
> >> On 17 Mar 2024, at 3:06?PM, Zongze Yang wrote: >> >> This Message Is From an External Sender >> This message came from outside your organization. >> Understood. Thank you for your advice. >> >> Best wishes, >> Zongze >> >>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: >>> >>> >>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. >>> >>> >>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: >>>> >>>> This Message Is From an External Sender >>>> This message came from outside your organization. >>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >>>> >>>> Best wishes, >>>> Zongze >>>> >>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>>>> >>>>> >>>>> >>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>>>> >>>>>> Thank you for providing the instructions. I try the first option. >>>>>> >>>>>> Now, the error of the configuration is related to OpenBLAS. >>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >>>>> >>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>>> >>>>>> The configure.log is attached. The errors are show below: >>>>>> ``` >>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>> ^~~~ >>>>>> & >>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>> ^ >>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>> ^~~~ >>>>>> & >>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>> ^ >>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>> ^~~~ >>>>>> & >>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const 
float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>> ^ >>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>> ^~~~ >>>>>> & >>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>> ^ >>>>>> 4 errors generated. >>>>>> ``` >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>>> >>>>>> >>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>>>> >>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!ayl2seevZWjJZO9OMKEHMo_bZDQOX3aAF1By7x_1x9rbjOmrQ2DwsX1KitiUFsStTf9_ThHAAHkgb1dIJpz1zt2Q$ >>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!ayl2seevZWjJZO9OMKEHMo_bZDQOX3aAF1By7x_1x9rbjOmrQ2DwsX1KitiUFsStTf9_ThHAAHkgb1dIJu7sVyGp$ , so the alternative is to revert to the commit prior. >>>>>>> It should work either way. >>>>>>> >>>>>>> Thanks, >>>>>>> Pierre >>>>>>> >>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>>>> >>>>>>>> ? >>>>>>>> This Message Is From an External Sender >>>>>>>> This message came from outside your organization. >>>>>>>> Hi, PETSc Team, >>>>>>>> >>>>>>>> I am trying to install petsc with the following configuration >>>>>>>> ``` >>>>>>>> ./configure \ >>>>>>>> --download-bison \ >>>>>>>> --download-mpich \ >>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>>>> --download-hwloc \ >>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>>>> --download-openblas \ >>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>>>> --with-shared-libraries=1 \ >>>>>>>> --with-fortran-bindings=0 \ >>>>>>>> --with-zlib \ >>>>>>>> LDFLAGS=-Wl,-ld_classic >>>>>>>> ``` >>>>>>>> >>>>>>>> The log shows that >>>>>>>> ``` >>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>>>> ``` >>>>>>>> >>>>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>>>> >>>>>>>> The configure.log is attached. >>>>>>>> >>>>>>>> Thanks for your attention to this matter. >>>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sun Mar 17 11:23:10 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 17 Mar 2024 11:23:10 -0500 (CDT) Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> Message-ID: <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. 
One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? Satish ---- Executing: gcc --version stdout: Apple clang version 15.0.0 (clang-1500.3.9.4) Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); ^~~~ & vs: Executing: gcc --version stdout: Apple clang version 15.0.0 (clang-1500.1.0.2.5) Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); ^~~~ & On Sun, 17 Mar 2024, Pierre Jolivet wrote: > Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? > Take Barry?s advice. > Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. > > Thanks, > Pierre > > PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? 
I didn?t even know they existed. > > > On 17 Mar 2024, at 3:06?PM, Zongze Yang wrote: > > > > This Message Is From an External Sender > > This message came from outside your organization. > > Understood. Thank you for your advice. > > > > Best wishes, > > Zongze > > > >> On 17 Mar 2024, at 22:04, Barry Smith > wrote: > >> > >> > >> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. > >> > >> > >>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: > >>> > >>> This Message Is From an External Sender > >>> This message came from outside your organization. > >>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. > >>> > >>> Best wishes, > >>> Zongze > >>> > >>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: > >>>> > >>>> > >>>> > >>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: > >>>>> > >>>>> Thank you for providing the instructions. I try the first option. > >>>>> > >>>>> Now, the error of the configuration is related to OpenBLAS. > >>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? > >>>> > >>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? > >>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). > >>>> > >>>> Thanks, > >>>> Pierre > >>>> > >>>>> > >>>>> The configure.log is attached. The errors are show below: > >>>>> ``` > >>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>> ^~~~ > >>>>> & > >>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here > >>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > >>>>> ^ > >>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>> ^~~~ > >>>>> & > >>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here > >>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > >>>>> ^ > >>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>> ^~~~ > >>>>> & > >>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here > >>>>> void 
RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > >>>>> ^ > >>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>> ^~~~ > >>>>> & > >>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here > >>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > >>>>> ^ > >>>>> 4 errors generated. > >>>>> ``` > >>>>> > >>>>> Best wishes, > >>>>> Zongze > >>>>> > >>>>> > >>>>> > >>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: > >>>>>> > >>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ > >>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. > >>>>>> It should work either way. > >>>>>> > >>>>>> Thanks, > >>>>>> Pierre > >>>>>> > >>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: > >>>>>>> > >>>>>>> ? > >>>>>>> This Message Is From an External Sender > >>>>>>> This message came from outside your organization. > >>>>>>> Hi, PETSc Team, > >>>>>>> > >>>>>>> I am trying to install petsc with the following configuration > >>>>>>> ``` > >>>>>>> ./configure \ > >>>>>>> --download-bison \ > >>>>>>> --download-mpich \ > >>>>>>> --download-mpich-configure-arguments=--disable-opencl \ > >>>>>>> --download-hwloc \ > >>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ > >>>>>>> --download-openblas \ > >>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ > >>>>>>> --with-shared-libraries=1 \ > >>>>>>> --with-fortran-bindings=0 \ > >>>>>>> --with-zlib \ > >>>>>>> LDFLAGS=-Wl,-ld_classic > >>>>>>> ``` > >>>>>>> > >>>>>>> The log shows that > >>>>>>> ``` > >>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! > >>>>>>> ``` > >>>>>>> > >>>>>>> I recently updated the system and Xcode, as well as homebrew. > >>>>>>> > >>>>>>> The configure.log is attached. > >>>>>>> > >>>>>>> Thanks for your attention to this matter. > >>>>>>> > >>>>>>> Best wishes, > >>>>>>> Zongze > >>>>>>> > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log.gz Type: application/gzip Size: 1365006 bytes Desc: URL: From yangzongze at gmail.com Sun Mar 17 11:50:08 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Mon, 18 Mar 2024 00:50:08 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> Message-ID: It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new version compiler has been changed? Best wishes, Zongze > On 18 Mar 2024, at 00:23, Satish Balay wrote: > > Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. > > One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? > > Satish > > ---- > Executing: gcc --version > stdout: > Apple clang version 15.0.0 (clang-1500.3.9.4) > > Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show > stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > > /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o > src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > ^~~~ > & > > vs: > Executing: gcc --version > stdout: > Apple clang version 15.0.0 (clang-1500.1.0.2.5) > > Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show > stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > > > /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. 
-c src/lapack_wrappers.c -o src/lapack_wrappers.o > src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > ^~~~ > & > > > > > On Sun, 17 Mar 2024, Pierre Jolivet wrote: > >> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? >> Take Barry?s advice. >> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. >> >> Thanks, >> Pierre >> >> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. >> >>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: >>> >>> This Message Is From an External Sender >>> This message came from outside your organization. >>> Understood. Thank you for your advice. >>> >>> Best wishes, >>> Zongze >>> >>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: >>>> >>>> >>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. >>>> >>>> >>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: >>>>> >>>>> This Message Is From an External Sender >>>>> This message came from outside your organization. >>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>>>>> >>>>>> >>>>>> >>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>>>>> >>>>>>> Thank you for providing the instructions. I try the first option. >>>>>>> >>>>>>> Now, the error of the configuration is related to OpenBLAS. >>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >>>>>> >>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> >>>>>>> The configure.log is attached. 
The errors are show below: >>>>>>> ``` >>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>> ^~~~ >>>>>>> & >>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>> ^ >>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>> ^~~~ >>>>>>> & >>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>> ^ >>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>> ^~~~ >>>>>>> & >>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>> ^ >>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>> ^~~~ >>>>>>> & >>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>> ^ >>>>>>> 4 errors generated. >>>>>>> ``` >>>>>>> >>>>>>> Best wishes, >>>>>>> Zongze >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>>>>> >>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ >>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. >>>>>>>> It should work either way. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>>>>> >>>>>>>>> ? 
>>>>>>>>> This Message Is From an External Sender >>>>>>>>> This message came from outside your organization. >>>>>>>>> Hi, PETSc Team, >>>>>>>>> >>>>>>>>> I am trying to install petsc with the following configuration >>>>>>>>> ``` >>>>>>>>> ./configure \ >>>>>>>>> --download-bison \ >>>>>>>>> --download-mpich \ >>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>>>>> --download-hwloc \ >>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>>>>> --download-openblas \ >>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>>>>> --with-shared-libraries=1 \ >>>>>>>>> --with-fortran-bindings=0 \ >>>>>>>>> --with-zlib \ >>>>>>>>> LDFLAGS=-Wl,-ld_classic >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> The log shows that >>>>>>>>> ``` >>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>>>>> >>>>>>>>> The configure.log is attached. >>>>>>>>> >>>>>>>>> Thanks for your attention to this matter. >>>>>>>>> >>>>>>>>> Best wishes, >>>>>>>>> Zongze >>>>>>>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From 202321009113 at mail.scut.edu.cn Sun Mar 17 22:57:58 2024 From: 202321009113 at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Mon, 18 Mar 2024 11:57:58 +0800 (GMT+08:00) Subject: [petsc-users] Using PetscPartitioner on WINDOWS Message-ID: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> Hello? Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: 1. PETSc: version 3.16.5 2. VS: version 2022 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit 4. Cygwin On windows, Then I try to calculate a simple cantilever beam that use Tetrahedral mesh. So it's unstructured grid I use DMPlexCreateFromFile() to creat dmplex. And then I want to distributing the mesh for using PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, see fig 1 for my work to see different PetscPartitioner type about a cantilever beam in Linux system.) But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz it shows that ******************************************************************************* External package metis does not support --download-metis with Microsoft compilers ******************************************************************************* configure.log and make.log is attached If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. 
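For reference, selecting the partitioner used by DMPlexDistribute() only takes a few calls -- a minimal sketch (error checking omitted; `dm` is assumed to be the DMPlex created with DMPlexCreateFromFile()):

```
#include <petscdmplex.h>

/* Sketch: distribute a DMPlex with ParMETIS.
   Equivalent to the runtime option -petscpartitioner_type parmetis. */
static void DistributeWithParmetis(DM *dm)
{
  PetscPartitioner part;
  DM               dmDist = NULL;

  DMPlexGetPartitioner(*dm, &part);                        /* partitioner is owned by the DM */
  PetscPartitionerSetType(part, PETSCPARTITIONERPARMETIS);
  PetscPartitionerSetFromOptions(part);
  DMPlexDistribute(*dm, 0, NULL, &dmDist);                 /* overlap = 0 */
  if (dmDist) {                                            /* NULL when nothing was moved, e.g. on 1 rank */
    DMDestroy(dm);
    *dm = dmDist;
  }
}
```

Only the type string changes if another partitioner (e.g. "simple" or "ptscotch") is selected.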
So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis, or I just try to download parmetis separatly on windows(like this website , https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!bpnh34xNHVqfdKTl-ggRXSax29UDMeTVK_E0bHs5J3_g1-RuvJhSbmxD6SKjngZpPiIKpgrv3h4WWOt9lEF-Fb8A4UZELMSi-MSW$ ) and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. So I wrrit this email to report my problem and ask for your help. Looking forward your reply! sinserely, Ben. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: figure1-Type of PetscPartitioner.png Type: image/png Size: 130155 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure.log URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: make.log URL: From yangzongze at gmail.com Mon Mar 18 00:26:46 2024 From: yangzongze at gmail.com (Zongze Yang) Date: Mon, 18 Mar 2024 13:26:46 +0800 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> Message-ID: The issue of openblas was resolved by this pr https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/pull/4565__;!!G_uCfscf7eWS!b09n5clcTFuLceLY_9KfqtSsgmmCIBLFbqciRVCKvnvFw9zTaNF8ssK0MiQlBOXUJe7H88nl-7ExdfhB-cMXLQ2d$ Best wishes, Zongze > On 18 Mar 2024, at 00:50, Zongze Yang wrote: > > It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new version compiler has been changed? > > Best wishes, > Zongze >> On 18 Mar 2024, at 00:23, Satish Balay wrote: >> >> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. >> >> One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? 
>> >> Satish >> >> ---- >> Executing: gcc --version >> stdout: >> Apple clang version 15.0.0 (clang-1500.3.9.4) >> >> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show >> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi >> >> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o >> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >> ^~~~ >> & >> >> vs: >> Executing: gcc --version >> stdout: >> Apple clang version 15.0.0 (clang-1500.1.0.2.5) >> >> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show >> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi >> >> >> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o >> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >> ^~~~ >> & >> >> >> >> >> On Sun, 17 Mar 2024, Pierre Jolivet wrote: >> >>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? >>> Take Barry?s advice. >>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. >>> >>> Thanks, >>> Pierre >>> >>> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. 
>>> >>>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: >>>> >>>> This Message Is From an External Sender >>>> This message came from outside your organization. >>>> Understood. Thank you for your advice. >>>> >>>> Best wishes, >>>> Zongze >>>> >>>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: >>>>> >>>>> >>>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. >>>>> >>>>> >>>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: >>>>>> >>>>>> This Message Is From an External Sender >>>>>> This message came from outside your organization. >>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>>>>>> >>>>>>>> Thank you for providing the instructions. I try the first option. >>>>>>>> >>>>>>>> Now, the error of the configuration is related to OpenBLAS. >>>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >>>>>>> >>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>>>>>> >>>>>>> Thanks, >>>>>>> Pierre >>>>>>> >>>>>>>> >>>>>>>> The configure.log is attached. The errors are show below: >>>>>>>> ``` >>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>> ^~~~ >>>>>>>> & >>>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>>> ^ >>>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>> ^~~~ >>>>>>>> & >>>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>>> ^ >>>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>> ^~~~ >>>>>>>> & >>>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here 
>>>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>>> ^ >>>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>> ^~~~ >>>>>>>> & >>>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>>> ^ >>>>>>>> 4 errors generated. >>>>>>>> ``` >>>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>>>>>> >>>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ >>>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. >>>>>>>>> It should work either way. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>>>>>> >>>>>>>>>> ? >>>>>>>>>> This Message Is From an External Sender >>>>>>>>>> This message came from outside your organization. >>>>>>>>>> Hi, PETSc Team, >>>>>>>>>> >>>>>>>>>> I am trying to install petsc with the following configuration >>>>>>>>>> ``` >>>>>>>>>> ./configure \ >>>>>>>>>> --download-bison \ >>>>>>>>>> --download-mpich \ >>>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>>>>>> --download-hwloc \ >>>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>>>>>> --download-openblas \ >>>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>>>>>> --with-shared-libraries=1 \ >>>>>>>>>> --with-fortran-bindings=0 \ >>>>>>>>>> --with-zlib \ >>>>>>>>>> LDFLAGS=-Wl,-ld_classic >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> The log shows that >>>>>>>>>> ``` >>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>>>>>> >>>>>>>>>> The configure.log is attached. >>>>>>>>>> >>>>>>>>>> Thanks for your attention to this matter. >>>>>>>>>> >>>>>>>>>> Best wishes, >>>>>>>>>> Zongze >>>>>>>>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jl2862237661 at gmail.com Mon Mar 18 07:06:48 2024 From: jl2862237661 at gmail.com (Waltz Jan) Date: Mon, 18 Mar 2024 20:06:48 +0800 Subject: [petsc-users] MatSetValues() can't work right Message-ID: PETSc version: 3.20.4 Program: #include #include #include #include #include #include #include int main() { PetscInitialize(NULL, NULL, NULL, NULL); DM da; DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DMDA_STENCIL_STAR, 10, 1, 10, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 3, 1, NULL, NULL, NULL, &da); DMSetFromOptions(da); DMSetUp(da); Mat Jac; DMCreateMatrix(da, &Jac); int row = 100, col = 100; double val = 1.; MatSetValues(Jac, 1, &row, 1, &col, &val, INSERT_VALUES); MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY); PetscViewer viewer; PetscViewerASCIIOpen(PETSC_COMM_WORLD, "./jacobianmatrix.m", &viewer); PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); MatView(Jac, viewer); PetscViewerDestroy(&viewer); PetscFinalize(); } When I ran the program with np = 6, I got the result as the below [image: image.png] It's obviously wrong. When I ran the program with np = 1 or 8, I got the right result as [image: image.png] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 51007 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 47126 bytes Desc: not available URL: From bsmith at petsc.dev Mon Mar 18 08:11:14 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 18 Mar 2024 09:11:14 -0400 Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> Message-ID: Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. Barry > On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > This Message Is From an External Sender > This message came from outside your organization. > Hello? > > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > 1. PETSc: version 3.16.5 > 2. VS: version 2022 > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > 4. Cygwin > > > On windows, > Then I try to calculate a simple cantilever beam that use Tetrahedral mesh. So it's unstructured grid > I use DMPlexCreateFromFile() to creat dmplex. > And then I want to distributing the mesh for using PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, > > see fig 1 for my work to see different PetscPartitioner type about a cantilever beam in Linux system.) 
> > But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows > > > ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz > > > > > it shows that > ******************************************************************************* > External package metis does not support --download-metis with Microsoft compilers > ******************************************************************************* > configure.log and make.log is attached > > > > If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. > So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis, > > or I just try to download parmetis separatly on windows(like this website , https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!aQuBcC4tH7O2WJoZkJWAnLHZplCB2W8UcXfvQSouKHeLkTk8v4zBycDCdUN6Xa3w9NCQcanI2isN-FopN4gfXyE$ )? > and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. > > > > So I wrrit this email to report my problem and ask for your help. > > Looking forward your reply! > > > sinserely, > Ben. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Mar 18 08:28:36 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 18 Mar 2024 09:28:36 -0400 Subject: [petsc-users] MatSetValues() can't work right In-Reply-To: References: Message-ID: <7E1E2114-7D96-4382-BC9A-438C78429F23@petsc.dev> The output is correct (only confusing). For PETSc DMDA by default viewing a parallel matrix converts it to the "natural" ordering instead of the PETSc parallel ordering. See the Notes in https://urldefense.us/v3/__https://petsc.org/release/manualpages/DM/DMCreateMatrix/__;!!G_uCfscf7eWS!dHznAiOiU4NDCipIaS1et2IIGx1u779XYQmMGk4EeeLQf41tAhbciI4ne1JKfOR0jG5WCsFm7dRWuEy6KdNhM5w$ Barry > On Mar 18, 2024, at 8:06?AM, Waltz Jan wrote: > > This Message Is From an External Sender > This message came from outside your organization. 
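A minimal sketch of how the two orderings can be related for the program quoted below (the AO attached to the DMDA translates between the PETSc parallel ordering used by MatSetValues() and the natural ordering used by the viewer; error checking omitted):

```
/* Sketch: "da" is the DMDA from the quoted program; 100 is the index passed to MatSetValues(). */
AO       ao;
PetscInt idx = 100;

DMDAGetAO(da, &ao);                /* the AO is owned by the DM, do not destroy it */
AOPetscToApplication(ao, 1, &idx); /* idx now holds the corresponding natural index */
PetscPrintf(PETSC_COMM_WORLD, "PETSc index 100 maps to natural index %" PetscInt_FMT "\n", idx);
```

That natural index is where the single entry shows up in the MATLAB output, which is why the np = 6 result looks shifted even though the insertion itself is correct.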
> PETSc version: 3.20.4 > Program: > #include > #include > #include > #include > #include > #include > #include > > int main() > { > PetscInitialize(NULL, NULL, NULL, NULL); > DM da; > DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DMDA_STENCIL_STAR, > 10, 1, 10, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 3, 1, NULL, NULL, NULL, &da); > DMSetFromOptions(da); > DMSetUp(da); > Mat Jac; > DMCreateMatrix(da, &Jac); > int row = 100, col = 100; > double val = 1.; > MatSetValues(Jac, 1, &row, 1, &col, &val, INSERT_VALUES); > MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY); > > PetscViewer viewer; > PetscViewerASCIIOpen(PETSC_COMM_WORLD, "./jacobianmatrix.m", &viewer); > PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); > MatView(Jac, viewer); > PetscViewerDestroy(&viewer); > > PetscFinalize(); > } > > When I ran the program with np = 6, I got the result as the below > > It's obviously wrong. > When I ran the program with np = 1 or 8, I got the right result as > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Mar 18 11:13:02 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 18 Mar 2024 11:13:02 -0500 (CDT) Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> Message-ID: Ah - the compiler did flag code bugs. > (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout) maybe we should retry - updating to the latest snapshot and see if this issue persists. Satish On Mon, 18 Mar 2024, Zongze Yang wrote: > The issue of openblas was resolved by this pr https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/pull/4565__;!!G_uCfscf7eWS!b09n5clcTFuLceLY_9KfqtSsgmmCIBLFbqciRVCKvnvFw9zTaNF8ssK0MiQlBOXUJe7H88nl-7ExdfhB-cMXLQ2d$ > > Best wishes, > Zongze > > > On 18 Mar 2024, at 00:50, Zongze Yang wrote: > > > > It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new version compiler has been changed? > > > > Best wishes, > > Zongze > >> On 18 Mar 2024, at 00:23, Satish Balay wrote: > >> > >> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. > >> > >> One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? 
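For reference, the code bug the compiler flagged reduces to the following pattern (a generic illustration, not the actual OpenBLAS source): an integer is passed where the prototype expects a pointer, which the newer Apple clang in the logs treats as an error unless -Wno-int-conversion is given.

```
/* Generic illustration of -Wint-conversion, not the OpenBLAS code itself. */
void takes_ptr(const int *n);

void caller(void)
{
  int n = 42;
  /* takes_ptr(n);     error: incompatible integer to pointer conversion */
  takes_ptr(&n);    /* the fix suggested by the diagnostic: take the address with & */
}
```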
> >> > >> Satish > >> > >> ---- > >> Executing: gcc --version > >> stdout: > >> Apple clang version 15.0.0 (clang-1500.3.9.4) > >> > >> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show > >> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > >> > >> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o > >> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >> ^~~~ > >> & > >> > >> vs: > >> Executing: gcc --version > >> stdout: > >> Apple clang version 15.0.0 (clang-1500.1.0.2.5) > >> > >> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show > >> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > >> > >> > >> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o > >> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >> ^~~~ > >> & > >> > >> > >> > >> > >> On Sun, 17 Mar 2024, Pierre Jolivet wrote: > >> > >>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? > >>> Take Barry?s advice. > >>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. 
> >>> > >>> Thanks, > >>> Pierre > >>> > >>> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. > >>> > >>>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: > >>>> > >>>> This Message Is From an External Sender > >>>> This message came from outside your organization. > >>>> Understood. Thank you for your advice. > >>>> > >>>> Best wishes, > >>>> Zongze > >>>> > >>>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: > >>>>> > >>>>> > >>>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. > >>>>> > >>>>> > >>>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: > >>>>>> > >>>>>> This Message Is From an External Sender > >>>>>> This message came from outside your organization. > >>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. > >>>>>> > >>>>>> Best wishes, > >>>>>> Zongze > >>>>>> > >>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: > >>>>>>>> > >>>>>>>> Thank you for providing the instructions. I try the first option. > >>>>>>>> > >>>>>>>> Now, the error of the configuration is related to OpenBLAS. > >>>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? > >>>>>>> > >>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? > >>>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Pierre > >>>>>>> > >>>>>>>> > >>>>>>>> The configure.log is attached. 
The errors are show below: > >>>>>>>> ``` > >>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>> ^~~~ > >>>>>>>> & > >>>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here > >>>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > >>>>>>>> ^ > >>>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>> ^~~~ > >>>>>>>> & > >>>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here > >>>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > >>>>>>>> ^ > >>>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>> ^~~~ > >>>>>>>> & > >>>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here > >>>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > >>>>>>>> ^ > >>>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>> ^~~~ > >>>>>>>> & > >>>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here > >>>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > >>>>>>>> ^ > >>>>>>>> 4 errors generated. > >>>>>>>> ``` > >>>>>>>> > >>>>>>>> Best wishes, > >>>>>>>> Zongze > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: > >>>>>>>>> > >>>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ > >>>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. > >>>>>>>>> It should work either way. 
> >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Pierre > >>>>>>>>> > >>>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: > >>>>>>>>>> > >>>>>>>>>> ? > >>>>>>>>>> This Message Is From an External Sender > >>>>>>>>>> This message came from outside your organization. > >>>>>>>>>> Hi, PETSc Team, > >>>>>>>>>> > >>>>>>>>>> I am trying to install petsc with the following configuration > >>>>>>>>>> ``` > >>>>>>>>>> ./configure \ > >>>>>>>>>> --download-bison \ > >>>>>>>>>> --download-mpich \ > >>>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ > >>>>>>>>>> --download-hwloc \ > >>>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ > >>>>>>>>>> --download-openblas \ > >>>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ > >>>>>>>>>> --with-shared-libraries=1 \ > >>>>>>>>>> --with-fortran-bindings=0 \ > >>>>>>>>>> --with-zlib \ > >>>>>>>>>> LDFLAGS=-Wl,-ld_classic > >>>>>>>>>> ``` > >>>>>>>>>> > >>>>>>>>>> The log shows that > >>>>>>>>>> ``` > >>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! > >>>>>>>>>> ``` > >>>>>>>>>> > >>>>>>>>>> I recently updated the system and Xcode, as well as homebrew. > >>>>>>>>>> > >>>>>>>>>> The configure.log is attached. > >>>>>>>>>> > >>>>>>>>>> Thanks for your attention to this matter. > >>>>>>>>>> > >>>>>>>>>> Best wishes, > >>>>>>>>>> Zongze > >>>>>>>>>> > >>> > >> > > > > From pierre at joliv.et Mon Mar 18 12:12:35 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 18 Mar 2024 18:12:35 +0100 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> Message-ID: <48DE8BA9-58A2-4607-841E-098895F94157@joliv.et> > On 18 Mar 2024, at 5:13?PM, Satish Balay via petsc-users wrote: > > Ah - the compiler did flag code bugs. > >> (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout) > > maybe we should retry - updating to the latest snapshot and see if this issue persists. Well, that?s easy to see it is _still_ broken: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6419779589__;!!G_uCfscf7eWS!cZG6l8dQlL2q2LEgYiQw4bVE64zferDGxmonm_Z2I-6VXhae4u8oQiPv0BSGhXpi3y27-tKR-5wxh9MDGerWMQ$ The infamous gcc segfault that can?t let us run the pipeline, but that builds fine when it?s you that connect to the machine (I bothered you about this a couple of months ago in case you don?t remember, see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7143__;!!G_uCfscf7eWS!cZG6l8dQlL2q2LEgYiQw4bVE64zferDGxmonm_Z2I-6VXhae4u8oQiPv0BSGhXpi3y27-tKR-5wxh9NMU62L0Q$ ). Thanks, Pierre > > Satish > > On Mon, 18 Mar 2024, Zongze Yang wrote: > >> The issue of openblas was resolved by this pr https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/pull/4565__;!!G_uCfscf7eWS!b09n5clcTFuLceLY_9KfqtSsgmmCIBLFbqciRVCKvnvFw9zTaNF8ssK0MiQlBOXUJe7H88nl-7ExdfhB-cMXLQ2d$ >> >> Best wishes, >> Zongze >> >>> On 18 Mar 2024, at 00:50, Zongze Yang wrote: >>> >>> It can be resolved by adding CFLAGS=-Wno-int-conversion. 
Perhaps the default behaviour of the new version compiler has been changed? >>> >>> Best wishes, >>> Zongze >>>> On 18 Mar 2024, at 00:23, Satish Balay wrote: >>>> >>>> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. >>>> >>>> One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? >>>> >>>> Satish >>>> >>>> ---- >>>> Executing: gcc --version >>>> stdout: >>>> Apple clang version 15.0.0 (clang-1500.3.9.4) >>>> >>>> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show >>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi >>>> >>>> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o >>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>> ^~~~ >>>> & >>>> >>>> vs: >>>> Executing: gcc --version >>>> stdout: >>>> Apple clang version 15.0.0 (clang-1500.1.0.2.5) >>>> >>>> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show >>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi >>>> >>>> >>>> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o >>>> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>> ^~~~ >>>> & >>>> >>>> >>>> >>>> >>>> On Sun, 17 Mar 2024, Pierre Jolivet wrote: >>>> >>>>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? >>>>> Take Barry?s advice. 
>>>>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. >>>>> >>>>> Thanks, >>>>> Pierre >>>>> >>>>> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. >>>>> >>>>>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: >>>>>> >>>>>> This Message Is From an External Sender >>>>>> This message came from outside your organization. >>>>>> Understood. Thank you for your advice. >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>> >>>>>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: >>>>>>> >>>>>>> >>>>>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. >>>>>>> >>>>>>> >>>>>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: >>>>>>>> >>>>>>>> This Message Is From an External Sender >>>>>>>> This message came from outside your organization. >>>>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >>>>>>>> >>>>>>>> Best wishes, >>>>>>>> Zongze >>>>>>>> >>>>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>>>>>>>> >>>>>>>>>> Thank you for providing the instructions. I try the first option. >>>>>>>>>> >>>>>>>>>> Now, the error of the configuration is related to OpenBLAS. >>>>>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >>>>>>>>> >>>>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>>>>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Pierre >>>>>>>>> >>>>>>>>>> >>>>>>>>>> The configure.log is attached. 
The errors are show below: >>>>>>>>>> ``` >>>>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>> ^~~~ >>>>>>>>>> & >>>>>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>>>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>>>>> ^ >>>>>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>> ^~~~ >>>>>>>>>> & >>>>>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>>>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>>>>> ^ >>>>>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>> ^~~~ >>>>>>>>>> & >>>>>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>>>>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>>>>> ^ >>>>>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>> ^~~~ >>>>>>>>>> & >>>>>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>>>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>>>>> ^ >>>>>>>>>> 4 errors generated. >>>>>>>>>> ``` >>>>>>>>>> >>>>>>>>>> Best wishes, >>>>>>>>>> Zongze >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>>>>>>>> >>>>>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ >>>>>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. >>>>>>>>>>> It should work either way. 
>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Pierre >>>>>>>>>>> >>>>>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>>>>>>>> >>>>>>>>>>>> ? >>>>>>>>>>>> This Message Is From an External Sender >>>>>>>>>>>> This message came from outside your organization. >>>>>>>>>>>> Hi, PETSc Team, >>>>>>>>>>>> >>>>>>>>>>>> I am trying to install petsc with the following configuration >>>>>>>>>>>> ``` >>>>>>>>>>>> ./configure \ >>>>>>>>>>>> --download-bison \ >>>>>>>>>>>> --download-mpich \ >>>>>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>>>>>>>> --download-hwloc \ >>>>>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>>>>>>>> --download-openblas \ >>>>>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>>>>>>>> --with-shared-libraries=1 \ >>>>>>>>>>>> --with-fortran-bindings=0 \ >>>>>>>>>>>> --with-zlib \ >>>>>>>>>>>> LDFLAGS=-Wl,-ld_classic >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> The log shows that >>>>>>>>>>>> ``` >>>>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>>>>>>>> >>>>>>>>>>>> The configure.log is attached. >>>>>>>>>>>> >>>>>>>>>>>> Thanks for your attention to this matter. >>>>>>>>>>>> >>>>>>>>>>>> Best wishes, >>>>>>>>>>>> Zongze >>>>>>>>>>>> >>>>> >>>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Mar 18 12:35:21 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 18 Mar 2024 12:35:21 -0500 (CDT) Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <48DE8BA9-58A2-4607-841E-098895F94157@joliv.et> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> <48DE8BA9-58A2-4607-841E-098895F94157@joliv.et> Message-ID: <51474162-f43c-c16e-dd4c-d52a2c23e291@mcs.anl.gov> On Mon, 18 Mar 2024, Pierre Jolivet wrote: > > > > On 18 Mar 2024, at 5:13?PM, Satish Balay via petsc-users wrote: > > > > Ah - the compiler did flag code bugs. > > > >> (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout) > > > > maybe we should retry - updating to the latest snapshot and see if this issue persists. > > Well, that?s easy to see it is _still_ broken: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6419779589__;!!G_uCfscf7eWS!f4svx7Rv1mmcLfy5l0C9bXXrw9gwb49ykkTb28IAtZW0VgZ8vgdD8exUOZSL0TCEqqP5X-p-0ll6TetPkw$ > The infamous gcc segfault that can?t let us run the pipeline, but that builds fine when it?s you that connect to the machine (I bothered you about this a couple of months ago in case you don?t remember, see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7143__;!!G_uCfscf7eWS!f4svx7Rv1mmcLfy5l0C9bXXrw9gwb49ykkTb28IAtZW0VgZ8vgdD8exUOZSL0TCEqqP5X-p-0llrLiE4GQ$ ). > make[2]: *** [../../Makefile.tail:46: libs] Bus error (core dumped) Ah - ok - that's a strange error. I'm not sure how to debug it. [it fails when the build is invoked from configure - but not when its invoked directly from bash/shell.] 
Satish > > Thanks, > Pierre > > > > > Satish > > > > On Mon, 18 Mar 2024, Zongze Yang wrote: > > > >> The issue of openblas was resolved by this pr https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/pull/4565__;!!G_uCfscf7eWS!b09n5clcTFuLceLY_9KfqtSsgmmCIBLFbqciRVCKvnvFw9zTaNF8ssK0MiQlBOXUJe7H88nl-7ExdfhB-cMXLQ2d$ > >> > >> Best wishes, > >> Zongze > >> > >>> On 18 Mar 2024, at 00:50, Zongze Yang wrote: > >>> > >>> It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new version compiler has been changed? > >>> > >>> Best wishes, > >>> Zongze > >>>> On 18 Mar 2024, at 00:23, Satish Balay wrote: > >>>> > >>>> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. > >>>> > >>>> One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? > >>>> > >>>> Satish > >>>> > >>>> ---- > >>>> Executing: gcc --version > >>>> stdout: > >>>> Apple clang version 15.0.0 (clang-1500.3.9.4) > >>>> > >>>> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show > >>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > >>>> > >>>> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o > >>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>> ^~~~ > >>>> & > >>>> > >>>> vs: > >>>> Executing: gcc --version > >>>> stdout: > >>>> Apple clang version 15.0.0 (clang-1500.1.0.2.5) > >>>> > >>>> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show > >>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > >>>> > >>>> > >>>> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. 
-c src/lapack_wrappers.c -o src/lapack_wrappers.o > >>>> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>> ^~~~ > >>>> & > >>>> > >>>> > >>>> > >>>> > >>>> On Sun, 17 Mar 2024, Pierre Jolivet wrote: > >>>> > >>>>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? > >>>>> Take Barry?s advice. > >>>>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. > >>>>> > >>>>> Thanks, > >>>>> Pierre > >>>>> > >>>>> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. > >>>>> > >>>>>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: > >>>>>> > >>>>>> This Message Is From an External Sender > >>>>>> This message came from outside your organization. > >>>>>> Understood. Thank you for your advice. > >>>>>> > >>>>>> Best wishes, > >>>>>> Zongze > >>>>>> > >>>>>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. > >>>>>>> > >>>>>>> > >>>>>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: > >>>>>>>> > >>>>>>>> This Message Is From an External Sender > >>>>>>>> This message came from outside your organization. > >>>>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. > >>>>>>>> > >>>>>>>> Best wishes, > >>>>>>>> Zongze > >>>>>>>> > >>>>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: > >>>>>>>>>> > >>>>>>>>>> Thank you for providing the instructions. I try the first option. > >>>>>>>>>> > >>>>>>>>>> Now, the error of the configuration is related to OpenBLAS. > >>>>>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? > >>>>>>>>> > >>>>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? > >>>>>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Pierre > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> The configure.log is attached. 
The errors are show below: > >>>>>>>>>> ``` > >>>>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>>>> ^~~~ > >>>>>>>>>> & > >>>>>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here > >>>>>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > >>>>>>>>>> ^ > >>>>>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>>>> ^~~~ > >>>>>>>>>> & > >>>>>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here > >>>>>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > >>>>>>>>>> ^ > >>>>>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>>>> ^~~~ > >>>>>>>>>> & > >>>>>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here > >>>>>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > >>>>>>>>>> ^ > >>>>>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > >>>>>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > >>>>>>>>>> ^~~~ > >>>>>>>>>> & > >>>>>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here > >>>>>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > >>>>>>>>>> ^ > >>>>>>>>>> 4 errors generated. > >>>>>>>>>> ``` > >>>>>>>>>> > >>>>>>>>>> Best wishes, > >>>>>>>>>> Zongze > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: > >>>>>>>>>>> > >>>>>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ > >>>>>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. 
> >>>>>>>>>>> It should work either way. > >>>>>>>>>>> > >>>>>>>>>>> Thanks, > >>>>>>>>>>> Pierre > >>>>>>>>>>> > >>>>>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> ? > >>>>>>>>>>>> This Message Is From an External Sender > >>>>>>>>>>>> This message came from outside your organization. > >>>>>>>>>>>> Hi, PETSc Team, > >>>>>>>>>>>> > >>>>>>>>>>>> I am trying to install petsc with the following configuration > >>>>>>>>>>>> ``` > >>>>>>>>>>>> ./configure \ > >>>>>>>>>>>> --download-bison \ > >>>>>>>>>>>> --download-mpich \ > >>>>>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ > >>>>>>>>>>>> --download-hwloc \ > >>>>>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ > >>>>>>>>>>>> --download-openblas \ > >>>>>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ > >>>>>>>>>>>> --with-shared-libraries=1 \ > >>>>>>>>>>>> --with-fortran-bindings=0 \ > >>>>>>>>>>>> --with-zlib \ > >>>>>>>>>>>> LDFLAGS=-Wl,-ld_classic > >>>>>>>>>>>> ``` > >>>>>>>>>>>> > >>>>>>>>>>>> The log shows that > >>>>>>>>>>>> ``` > >>>>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! > >>>>>>>>>>>> ``` > >>>>>>>>>>>> > >>>>>>>>>>>> I recently updated the system and Xcode, as well as homebrew. > >>>>>>>>>>>> > >>>>>>>>>>>> The configure.log is attached. > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks for your attention to this matter. > >>>>>>>>>>>> > >>>>>>>>>>>> Best wishes, > >>>>>>>>>>>> Zongze > >>>>>>>>>>>> > >>>>> > >>>> > >>> > >> > >> > > From balay at mcs.anl.gov Mon Mar 18 13:59:34 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 18 Mar 2024 13:59:34 -0500 (CDT) Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <51474162-f43c-c16e-dd4c-d52a2c23e291@mcs.anl.gov> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> <48DE8BA9-58A2-4607-841E-098895F94157@joliv.et> <51474162-f43c-c16e-dd4c-d52a2c23e291@mcs.anl.gov> Message-ID: <5d37040f-6c6e-b318-dbdd-944c8e9b3881@mcs.anl.gov> On Mon, 18 Mar 2024, Satish Balay via petsc-users wrote: > On Mon, 18 Mar 2024, Pierre Jolivet wrote: > > > > > > > > On 18 Mar 2024, at 5:13?PM, Satish Balay via petsc-users wrote: > > > > > > Ah - the compiler did flag code bugs. > > > > > >> (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout) > > > > > > maybe we should retry - updating to the latest snapshot and see if this issue persists. > > > > Well, that?s easy to see it is _still_ broken: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6419779589__;!!G_uCfscf7eWS!f4svx7Rv1mmcLfy5l0C9bXXrw9gwb49ykkTb28IAtZW0VgZ8vgdD8exUOZSL0TCEqqP5X-p-0ll6TetPkw$ > > The infamous gcc segfault that can?t let us run the pipeline, but that builds fine when it?s you that connect to the machine (I bothered you about this a couple of months ago in case you don?t remember, see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7143__;!!G_uCfscf7eWS!f4svx7Rv1mmcLfy5l0C9bXXrw9gwb49ykkTb28IAtZW0VgZ8vgdD8exUOZSL0TCEqqP5X-p-0llrLiE4GQ$ ). 
> > > make[2]: *** [../../Makefile.tail:46: libs] Bus error (core dumped) > > Ah - ok - that's a strange error. I'm not sure how to debug it. [it fails when the build is invoked from configure - but not when its invoked directly from bash/shell.] Pushed a potential workaround to jolivet/test-openblas Note: The failure comes up on same OS (Fedora 39) on X64 aswell. Satish > > Satish > > > > > Thanks, > > Pierre > > > > > > > > Satish > > > > > > On Mon, 18 Mar 2024, Zongze Yang wrote: > > > > > >> The issue of openblas was resolved by this pr https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/pull/4565__;!!G_uCfscf7eWS!b09n5clcTFuLceLY_9KfqtSsgmmCIBLFbqciRVCKvnvFw9zTaNF8ssK0MiQlBOXUJe7H88nl-7ExdfhB-cMXLQ2d$ > > >> > > >> Best wishes, > > >> Zongze > > >> > > >>> On 18 Mar 2024, at 00:50, Zongze Yang wrote: > > >>> > > >>> It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new version compiler has been changed? > > >>> > > >>> Best wishes, > > >>> Zongze > > >>>> On 18 Mar 2024, at 00:23, Satish Balay wrote: > > >>>> > > >>>> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. > > >>>> > > >>>> One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? > > >>>> > > >>>> Satish > > >>>> > > >>>> ---- > > >>>> Executing: gcc --version > > >>>> stdout: > > >>>> Apple clang version 15.0.0 (clang-1500.3.9.4) > > >>>> > > >>>> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show > > >>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > > >>>> > > >>>> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. 
-c src/lapack_wrappers.c -o src/lapack_wrappers.o > > >>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > > >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > > >>>> ^~~~ > > >>>> & > > >>>> > > >>>> vs: > > >>>> Executing: gcc --version > > >>>> stdout: > > >>>> Apple clang version 15.0.0 (clang-1500.1.0.2.5) > > >>>> > > >>>> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show > > >>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi > > >>>> > > >>>> > > >>>> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o > > >>>> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > > >>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > > >>>> ^~~~ > > >>>> & > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> On Sun, 17 Mar 2024, Pierre Jolivet wrote: > > >>>> > > >>>>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? > > >>>>> Take Barry?s advice. > > >>>>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. > > >>>>> > > >>>>> Thanks, > > >>>>> Pierre > > >>>>> > > >>>>> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. > > >>>>> > > >>>>>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: > > >>>>>> > > >>>>>> This Message Is From an External Sender > > >>>>>> This message came from outside your organization. > > >>>>>> Understood. Thank you for your advice. > > >>>>>> > > >>>>>> Best wishes, > > >>>>>> Zongze > > >>>>>> > > >>>>>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: > > >>>>>>> > > >>>>>>> > > >>>>>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. > > >>>>>>> > > >>>>>>> > > >>>>>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: > > >>>>>>>> > > >>>>>>>> This Message Is From an External Sender > > >>>>>>>> This message came from outside your organization. 
> > >>>>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. > > >>>>>>>> > > >>>>>>>> Best wishes, > > >>>>>>>> Zongze > > >>>>>>>> > > >>>>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: > > >>>>>>>>>> > > >>>>>>>>>> Thank you for providing the instructions. I try the first option. > > >>>>>>>>>> > > >>>>>>>>>> Now, the error of the configuration is related to OpenBLAS. > > >>>>>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? > > >>>>>>>>> > > >>>>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? > > >>>>>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). > > >>>>>>>>> > > >>>>>>>>> Thanks, > > >>>>>>>>> Pierre > > >>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> The configure.log is attached. The errors are show below: > > >>>>>>>>>> ``` > > >>>>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > > >>>>>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > > >>>>>>>>>> ^~~~ > > >>>>>>>>>> & > > >>>>>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here > > >>>>>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > > >>>>>>>>>> ^ > > >>>>>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > > >>>>>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > > >>>>>>>>>> ^~~~ > > >>>>>>>>>> & > > >>>>>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here > > >>>>>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > > >>>>>>>>>> ^ > > >>>>>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > > >>>>>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > > >>>>>>>>>> ^~~~ > > >>>>>>>>>> & > > >>>>>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here > > >>>>>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); > > >>>>>>>>>> ^ > > >>>>>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter 
of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] > > >>>>>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); > > >>>>>>>>>> ^~~~ > > >>>>>>>>>> & > > >>>>>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here > > >>>>>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); > > >>>>>>>>>> ^ > > >>>>>>>>>> 4 errors generated. > > >>>>>>>>>> ``` > > >>>>>>>>>> > > >>>>>>>>>> Best wishes, > > >>>>>>>>>> Zongze > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: > > >>>>>>>>>>> > > >>>>>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ > > >>>>>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. > > >>>>>>>>>>> It should work either way. > > >>>>>>>>>>> > > >>>>>>>>>>> Thanks, > > >>>>>>>>>>> Pierre > > >>>>>>>>>>> > > >>>>>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: > > >>>>>>>>>>>> > > >>>>>>>>>>>> ? > > >>>>>>>>>>>> This Message Is From an External Sender > > >>>>>>>>>>>> This message came from outside your organization. > > >>>>>>>>>>>> Hi, PETSc Team, > > >>>>>>>>>>>> > > >>>>>>>>>>>> I am trying to install petsc with the following configuration > > >>>>>>>>>>>> ``` > > >>>>>>>>>>>> ./configure \ > > >>>>>>>>>>>> --download-bison \ > > >>>>>>>>>>>> --download-mpich \ > > >>>>>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ > > >>>>>>>>>>>> --download-hwloc \ > > >>>>>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ > > >>>>>>>>>>>> --download-openblas \ > > >>>>>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ > > >>>>>>>>>>>> --with-shared-libraries=1 \ > > >>>>>>>>>>>> --with-fortran-bindings=0 \ > > >>>>>>>>>>>> --with-zlib \ > > >>>>>>>>>>>> LDFLAGS=-Wl,-ld_classic > > >>>>>>>>>>>> ``` > > >>>>>>>>>>>> > > >>>>>>>>>>>> The log shows that > > >>>>>>>>>>>> ``` > > >>>>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! > > >>>>>>>>>>>> ``` > > >>>>>>>>>>>> > > >>>>>>>>>>>> I recently updated the system and Xcode, as well as homebrew. > > >>>>>>>>>>>> > > >>>>>>>>>>>> The configure.log is attached. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Thanks for your attention to this matter. 
> > >>>>>>>>>>>> > > >>>>>>>>>>>> Best wishes, > > >>>>>>>>>>>> Zongze > > >>>>>>>>>>>> > > >>>>> > > >>>> > > >>> > > >> > > >> > > > > > From pierre at joliv.et Mon Mar 18 14:17:18 2024 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 18 Mar 2024 20:17:18 +0100 Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: <5d37040f-6c6e-b318-dbdd-944c8e9b3881@mcs.anl.gov> References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> <48DE8BA9-58A2-4607-841E-098895F94157@joliv.et> <51474162-f43c-c16e-dd4c-d52a2c23e291@mcs.anl.gov> <5d37040f-6c6e-b318-dbdd-944c8e9b3881@mcs.anl.gov> Message-ID: > On 18 Mar 2024, at 7:59?PM, Satish Balay via petsc-users wrote: > > On Mon, 18 Mar 2024, Satish Balay via petsc-users wrote: > >> On Mon, 18 Mar 2024, Pierre Jolivet wrote: >> >>> >>> >>>> On 18 Mar 2024, at 5:13?PM, Satish Balay via petsc-users wrote: >>>> >>>> Ah - the compiler did flag code bugs. >>>> >>>>> (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout) >>>> >>>> maybe we should retry - updating to the latest snapshot and see if this issue persists. >>> >>> Well, that?s easy to see it is _still_ broken: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6419779589__;!!G_uCfscf7eWS!f4svx7Rv1mmcLfy5l0C9bXXrw9gwb49ykkTb28IAtZW0VgZ8vgdD8exUOZSL0TCEqqP5X-p-0ll6TetPkw$ >>> The infamous gcc segfault that can?t let us run the pipeline, but that builds fine when it?s you that connect to the machine (I bothered you about this a couple of months ago in case you don?t remember, see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7143__;!!G_uCfscf7eWS!f4svx7Rv1mmcLfy5l0C9bXXrw9gwb49ykkTb28IAtZW0VgZ8vgdD8exUOZSL0TCEqqP5X-p-0llrLiE4GQ$ ). >> >>> make[2]: *** [../../Makefile.tail:46: libs] Bus error (core dumped) >> >> Ah - ok - that's a strange error. I'm not sure how to debug it. [it fails when the build is invoked from configure - but not when its invoked directly from bash/shell.] > > Pushed a potential workaround to jolivet/test-openblas And here we go: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6420606887__;!!G_uCfscf7eWS!ZRl7bXHfAYjDN_AxaP28sbWmVsW1LJNw3_FdSSjv_R3X7Ol03i_HRQZ-5iro-4Y-w6JpmqnJrp6g33qwH26Uag$ 20 minutes in, and still in the dm_* tests with timeouts right, left, and center. For reference, this prior job https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6418468279__;!!G_uCfscf7eWS!ZRl7bXHfAYjDN_AxaP28sbWmVsW1LJNw3_FdSSjv_R3X7Ol03i_HRQZ-5iro-4Y-w6JpmqnJrp6g33rzzaakGw$ completed in 3 minutes (OK, maybe add a couple of minutes to rebuild the packages to have a fair comparison). What did they do to OpenBLAS? Add a sleep() in their axpy? Thanks, Pierre > Note: The failure comes up on same OS (Fedora 39) on X64 aswell. 
> > Satish > >> >> Satish >> >>> >>> Thanks, >>> Pierre >>> >>>> >>>> Satish >>>> >>>> On Mon, 18 Mar 2024, Zongze Yang wrote: >>>> >>>>> The issue of openblas was resolved by this pr https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/pull/4565__;!!G_uCfscf7eWS!b09n5clcTFuLceLY_9KfqtSsgmmCIBLFbqciRVCKvnvFw9zTaNF8ssK0MiQlBOXUJe7H88nl-7ExdfhB-cMXLQ2d$ >>>>> >>>>> Best wishes, >>>>> Zongze >>>>> >>>>>> On 18 Mar 2024, at 00:50, Zongze Yang wrote: >>>>>> >>>>>> It can be resolved by adding CFLAGS=-Wno-int-conversion. Perhaps the default behaviour of the new version compiler has been changed? >>>>>> >>>>>> Best wishes, >>>>>> Zongze >>>>>>> On 18 Mar 2024, at 00:23, Satish Balay wrote: >>>>>>> >>>>>>> Hm - I just tried a build with balay/xcode15-mpich - and that goes through fine for me. So don't know what the difference here is. >>>>>>> >>>>>>> One difference is - I have a slightly older xcode. However your compiler appears to behave as using -Werror. Perhaps CFLAGS=-Wno-int-conversion will help here? >>>>>>> >>>>>>> Satish >>>>>>> >>>>>>> ---- >>>>>>> Executing: gcc --version >>>>>>> stdout: >>>>>>> Apple clang version 15.0.0 (clang-1500.3.9.4) >>>>>>> >>>>>>> Executing: /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -show >>>>>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/include -L/Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi >>>>>>> >>>>>>> /Users/zzyang/workspace/repos/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=12 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. -c src/lapack_wrappers.c -o src/lapack_wrappers.o >>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>> ^~~~ >>>>>>> & >>>>>>> >>>>>>> vs: >>>>>>> Executing: gcc --version >>>>>>> stdout: >>>>>>> Apple clang version 15.0.0 (clang-1500.1.0.2.5) >>>>>>> >>>>>>> Executing: /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -show >>>>>>> stdout: gcc -fPIC -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -fno-common -I/Users/balay/petsc/arch-darwin-c-debug/include -L/Users/balay/petsc/arch-darwin-c-debug/lib -lmpi -lpmpi >>>>>>> >>>>>>> >>>>>>> /Users/balay/petsc/arch-darwin-c-debug/bin/mpicc -O2 -DMAX_STACK_ALLOC=2048 -Wall -DF_INTERFACE_GFORT -fPIC -DNO_WARMUP -DMAX_CPU_NUMBER=24 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.21\" -march=armv8-a -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_lapack_wrappers -DASMFNAME=_lapack_wrappers_ -DNAME=lapack_wrappers_ -DCNAME=lapack_wrappers -DCHAR_NAME=\"lapack_wrappers_\" -DCHAR_CNAME=\"lapack_wrappers\" -DNO_AFFINITY -I.. 
-c src/lapack_wrappers.c -o src/lapack_wrappers.o >>>>>>> src/lapack_wrappers.c:570:81: warning: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>> ^~~~ >>>>>>> & >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Sun, 17 Mar 2024, Pierre Jolivet wrote: >>>>>>> >>>>>>>> Ah, my bad, I misread linux-opt-arm as a macOS runner, no wonder the option is not helping? >>>>>>>> Take Barry?s advice. >>>>>>>> Furthermore, it looks like OpenBLAS people are steering in the opposite direction as us, by forcing the use of ld-classic https://urldefense.us/v3/__https://github.com/OpenMathLib/OpenBLAS/commit/103d6f4e42fbe532ae4ea48e8d90d7d792bc93d2__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrazFoooQ$ , so that?s another good argument in favor of -framework Accelerate. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> PS: anyone benchmarked those https://urldefense.us/v3/__https://developer.apple.com/documentation/accelerate/sparse_solvers__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SrpnDvT5g$ ? I didn?t even know they existed. >>>>>>>> >>>>>>>>> On 17 Mar 2024, at 3:06?PM, Zongze Yang > wrote: >>>>>>>>> >>>>>>>>> This Message Is From an External Sender >>>>>>>>> This message came from outside your organization. >>>>>>>>> Understood. Thank you for your advice. >>>>>>>>> >>>>>>>>> Best wishes, >>>>>>>>> Zongze >>>>>>>>> >>>>>>>>>> On 17 Mar 2024, at 22:04, Barry Smith > wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I would just avoid the --download-openblas option. The BLAS/LAPACK provided by Apple should perform fine, perhaps even better than OpenBLAS on your system. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Mar 17, 2024, at 9:58?AM, Zongze Yang > wrote: >>>>>>>>>>> >>>>>>>>>>> This Message Is From an External Sender >>>>>>>>>>> This message came from outside your organization. >>>>>>>>>>> Adding the flag `--download-openblas-make-options=TARGET=GENERIC` did not resolve the issue. The same error persisted. >>>>>>>>>>> >>>>>>>>>>> Best wishes, >>>>>>>>>>> Zongze >>>>>>>>>>> >>>>>>>>>>>> On 17 Mar 2024, at 20:58, Pierre Jolivet > wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On 17 Mar 2024, at 1:04?PM, Zongze Yang > wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Thank you for providing the instructions. I try the first option. >>>>>>>>>>>>> >>>>>>>>>>>>> Now, the error of the configuration is related to OpenBLAS. >>>>>>>>>>>>> Add `--CFLAGS=-Wno-int-conversion` to configure command resolve this. Should this be reported to OpenBLAS? Or need to fix the configure in petsc? >>>>>>>>>>>> >>>>>>>>>>>> I see our linux-opt-arm runner is using the additional flag '--download-openblas-make-options=TARGET=GENERIC', could you maybe try to add that as well? >>>>>>>>>>>> I don?t think there is much to fix on our end, OpenBLAS has been very broken lately on arm (current version is 0.3.26 but we can?t update because there is a huge performance regression which makes the pipeline timeout). >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Pierre >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> The configure.log is attached. 
The errors are show below: >>>>>>>>>>>>> ``` >>>>>>>>>>>>> src/lapack_wrappers.c:570:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>>>>> RELAPACK_sgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>>>>> ^~~~ >>>>>>>>>>>>> & >>>>>>>>>>>>> src/../inc/relapack.h:74:216: note: passing argument to parameter here >>>>>>>>>>>>> void RELAPACK_sgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>>>>>>>> ^ >>>>>>>>>>>>> src/lapack_wrappers.c:583:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>>>>> RELAPACK_dgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>>>>> ^~~~ >>>>>>>>>>>>> & >>>>>>>>>>>>> src/../inc/relapack.h:75:221: note: passing argument to parameter here >>>>>>>>>>>>> void RELAPACK_dgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>>>>>>>> ^ >>>>>>>>>>>>> src/lapack_wrappers.c:596:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>>>>> RELAPACK_cgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>>>>> ^~~~ >>>>>>>>>>>>> & >>>>>>>>>>>>> src/../inc/relapack.h:76:216: note: passing argument to parameter here >>>>>>>>>>>>> void RELAPACK_cgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const float *, const float *, const blasint *, const float *, const blasint *, const float *, float *, const blasint *); >>>>>>>>>>>>> ^ >>>>>>>>>>>>> src/lapack_wrappers.c:609:81: error: incompatible integer to pointer conversion passing 'blasint' (aka 'int') to parameter of type 'const blasint *' (aka 'const int *'); take the address with & [-Wint-conversion] >>>>>>>>>>>>> RELAPACK_zgemmt(uplo, transA, transB, n, k, alpha, A, ldA, B, ldB, beta, C, info); >>>>>>>>>>>>> ^~~~ >>>>>>>>>>>>> & >>>>>>>>>>>>> src/../inc/relapack.h:77:221: note: passing argument to parameter here >>>>>>>>>>>>> void RELAPACK_zgemmt(const char *, const char *, const char *, const blasint *, const blasint *, const double *, const double *, const blasint *, const double *, const blasint *, const double *, double *, const blasint *); >>>>>>>>>>>>> ^ >>>>>>>>>>>>> 4 errors generated. 
>>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>> Zongze >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On 17 Mar 2024, at 18:48, Pierre Jolivet > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> You need this MR https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7365__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9SqG8HOUGQ$ >>>>>>>>>>>>>> main has been broken for macOS since https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7341__;!!G_uCfscf7eWS!bY2l3X9Eb5PRzNQYrfPFXhgcUodHCiDinhQYga0PeQn1IQzJYD376fk-pZfktGAkpTvBmzy7BFDc9Soe8Kh_uQ$ , so the alternative is to revert to the commit prior. >>>>>>>>>>>>>> It should work either way. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> Pierre >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 17 Mar 2024, at 11:31?AM, Zongze Yang > wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>> This Message Is From an External Sender >>>>>>>>>>>>>>> This message came from outside your organization. >>>>>>>>>>>>>>> Hi, PETSc Team, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am trying to install petsc with the following configuration >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> ./configure \ >>>>>>>>>>>>>>> --download-bison \ >>>>>>>>>>>>>>> --download-mpich \ >>>>>>>>>>>>>>> --download-mpich-configure-arguments=--disable-opencl \ >>>>>>>>>>>>>>> --download-hwloc \ >>>>>>>>>>>>>>> --download-hwloc-configure-arguments=--disable-opencl \ >>>>>>>>>>>>>>> --download-openblas \ >>>>>>>>>>>>>>> --download-openblas-make-options="'USE_THREAD=0 USE_LOCKING=1 USE_OPENMP=0'" \ >>>>>>>>>>>>>>> --with-shared-libraries=1 \ >>>>>>>>>>>>>>> --with-fortran-bindings=0 \ >>>>>>>>>>>>>>> --with-zlib \ >>>>>>>>>>>>>>> LDFLAGS=-Wl,-ld_classic >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The log shows that >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> Exhausted all shared linker guesses. Could not determine how to create a shared library! >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I recently updated the system and Xcode, as well as homebrew. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The configure.log is attached. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for your attention to this matter. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best wishes, >>>>>>>>>>>>>>> Zongze >>>>>>>>>>>>>>> >>>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Mar 18 18:31:54 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 18 Mar 2024 18:31:54 -0500 (CDT) Subject: [petsc-users] Install PETSc with option `--with-shared-libraries=1` failed on MacOS In-Reply-To: References: <93DAC71E-10F8-48FC-A2E1-DA90F64129DB@gmail.com> <333E0143-AFCA-41F6-BD72-EBCF88C6BFBD@joliv.et> <7C476492-F3A6-417B-9A71-61129CC00BB2@gmail.com> <4713306B-F97E-4853-9E86-20BA4C818810@joliv.et> <8178ED7E-487A-432C-9D9B-08A7707CCF87@gmail.com> <0A79AA1A-F2D6-4415-B668-CB7323B3B3F8@petsc.dev> <8ef9433e-6061-2d88-970f-bd492a82bc0f@mcs.anl.gov> <48DE8BA9-58A2-4607-841E-098895F94157@joliv.et> <51474162-f43c-c16e-dd4c-d52a2c23e291@mcs.anl.gov> <5d37040f-6c6e-b318-dbdd-944c8e9b3881@mcs.anl.gov> Message-ID: On Mon, 18 Mar 2024, Pierre Jolivet wrote: > > And here we go: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6420606887__;!!G_uCfscf7eWS!alfBlmyFQ5JJUYKxxFdETav6xjHOl5W54BPrmJEyXdSakVXnj8eYIRZdknOI-FK4uiaPdL4zSdJlD2zrcw$ > 20 minutes in, and still in the dm_* tests with timeouts right, left, and center. 
> For reference, this prior job https://gitlab.com/petsc/petsc/-/jobs/6418468279 completed in 3 minutes (OK, maybe add a couple of minutes to rebuild the packages to have a fair comparison).
> What did they do to OpenBLAS? Add a sleep() in their axpy?

(gdb) r
Starting program: /home/petsc/petsc/src/dm/dt/tests/ex13
^C
Program received signal SIGINT, Interrupt.
0x0000fffff331ad10 in dgemm_otcopy (m=m at entry=8, n=n at entry=7, a=a at entry=0x58f150, lda=lda at entry=15, b=b at entry=0xffffefae0000) at ../kernel/arm64/../generic/gemm_tcopy_2.c:69
69          *(b_offset1 + 3) = *(a_offset2 + 1);
(gdb) where
#0  0x0000fffff331ad10 in dgemm_otcopy (m=m at entry=8, n=n at entry=7, a=a at entry=0x58f150, lda=lda at entry=15, b=b at entry=0xffffefae0000) at ../kernel/arm64/../generic/gemm_tcopy_2.c:69
#1  0x0000fffff3342e68 in dgetrf_single (args=args at entry=0xffffffffe9d8, range_m=range_m at entry=0x0, range_n=range_n at entry=0x0, sa=sa at entry=0xffffefae0000, sb=, myid=myid at entry=0) at getrf_single.c:157
#2  0x0000fffff3255ec4 in dgetrf_ (M=, N=, a=, ldA=, ipiv=, Info=0xffffffffeaa8) at lapack/getrf.c:110
#3  0x0000fffff50b8dd8 in MatLUFactor_SeqDense (A=0x598360, row=0x0, col=0x0, minfo=0xffffffffeba8) at /home/petsc/petsc/src/mat/impls/dense/seq/dense.c:801
#4  0x0000fffff559b8b4 in MatLUFactor (mat=0x598360, row=0x0, col=0x0, info=0xffffffffeba8) at /home/petsc/petsc/src/mat/interface/matrix.c:3087
#5  0x00000000004149e0 in test (dim=2, deg=3, form=-1, jetDegree=3, cond=PETSC_FALSE) at ex13.c:141
#6  0x0000000000418f20 in main (argc=1, argv=0xfffffffff158) at ex13.c:303
(gdb)

It appears to get stuck in a loop here. This test runs fine - if I remove the "--download-openblas-make-options=TARGET=GENERIC" option.

Ok - trying out "git bisect":

ea6c5f3cf553a23f8e2e787307805e7874e1f9c6 is the first bad commit
commit ea6c5f3cf553a23f8e2e787307805e7874e1f9c6
Author: Martin Kroeker
Date:   Sun Oct 30 12:55:23 2022 +0100

    Add option RELAPACK_REPLACE

 Makefile.rule   | 5 ++++-
 Makefile.system | 4 ++++
 2 files changed, 8 insertions(+), 1 deletion(-)

Don't really understand why this change is triggering this hang. Or the correct way to build latest openblas [do we need "BUILD_RELAPACK=1"?]

Satish

From bsmith at petsc.dev  Mon Mar 18 23:01:42 2024
From: bsmith at petsc.dev (Barry Smith)
Date: Tue, 19 Mar 2024 00:01:42 -0400
Subject: [petsc-users] MatSetValues() can't work right
In-Reply-To: 
References: <7E1E2114-7D96-4382-BC9A-438C78429F23 at petsc.dev>
Message-ID: 

> On Mar 18, 2024, at 9:41 PM, Waltz Jan wrote:
>
> Thank you for your response.
> However, even after reading the Notes in https://petsc.org/release/manualpages/DM/DMCreateMatrix/, I am still confused.
> According to the content in https://petsc.org/release/manualpages/Mat/MatSetValues/, idxm and idxn represent the global indices of the rows and columns, respectively. I created a matrix using DMDA with a size of 300x300, and I want to insert a value at the 101st row and 101st column. Shouldn't idxm and idxn be 100 in this case?

It does go into the matrix at I,J of 101 and 101.
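For example — just a sketch reusing the Jac, row, and col from your program (not code from any PETSc example), placed right after your MatAssemblyEnd() call — you can read the entry back by its global indices on the rank that owns that row and confirm it is there for any number of MPI processes:

```
PetscInt    rstart, rend, row = 100, col = 100;
PetscScalar v;

MatGetOwnershipRange(Jac, &rstart, &rend);   /* global rows owned by this rank */
if (row >= rstart && row < rend) {           /* MatGetValues() can only read locally owned rows */
  MatGetValues(Jac, 1, &row, 1, &col, &v);
  PetscPrintf(PETSC_COMM_SELF, "entry at global (100,100) = %g\n", (double)PetscRealPart(v));
}
```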
But MatView() when the matrix comes from a DMDA prints the matrix out with a different ordering (the "natural" ordering for a 2 or 3d grid) thus printed out it is not at the 101, 101 location. If you do PetscViewerPushFormat(viewer, PETSC_VIEWER_NATIVE); before calling MatView() then MatView will not print the matrix out in natural order, it will print it out as it is stored in parallel and you will see the location at 101, 101. > But when NP=6, the inserted value appears at a different position. > MatSetValuesStencil() allows each process to set entries for its points. You can use DMDAGetGhostCorners() to determine the allowed range of i,j,k that can be used in the MatStencil for each particular MPI process. See for example src/snes/tests/ex20.c the function FormJacobian(). It is for finite difference matrices where points only speak to their neighbors in the matrix. It does not support setting arbitrary I, J locations into a matrix from arbitrary processes. > I tried using MatSetValuesStencil, but the code threw an error with the following message: > CODES: > #include > #include > #include > #include > #include > #include > #include > > int main() > { > PetscInitialize(NULL, NULL, NULL, NULL); > DM da; > DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DMDA_STENCIL_STAR, > 10, 1, 10, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 3, 1, NULL, NULL, NULL, &da); > DMSetFromOptions(da); > DMSetUp(da); > Mat Jac; > DMCreateMatrix(da, &Jac); > > double val = 1.; > > MatStencil st{5,1,5,0}; > MatSetValuesStencil(Jac, 1, &st, 1, &st, &val, INSERT_VALUES); > MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY); > > PetscViewer viewer; > PetscViewerASCIIOpen(PETSC_COMM_WORLD, "./jacobianmatrix.m", &viewer); > PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); > MatView(Jac, viewer); > PetscViewerDestroy(&viewer); > > PetscFinalize(); > } > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Local index 438 too large 377 (max) at 0 > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!fTO1ShsqXrxcXKmKrn7uXjX68PlSaKv4RBgRvwP9BUQpeowdAqyQyxq3cSp_3H231u74LG5cJRd24lnAY72xwtI$ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.20.4, Jan 29, 2024 > [0]PETSC ERROR: Unknown Name on a arch-linux-cxx-debug named TUF-Gaming by lei Tue Mar 19 09:31:01 2024 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-hypre --with-debugging=yes --download-mpich --with-clanguage=cxx > [0]PETSC ERROR: #1 ISLocalToGlobalMappingApply() at /home/lei/Software/PETSc/petsc-3.20.4/src/vec/is/utils/isltog.c:789 > [0]PETSC ERROR: #2 MatSetValuesLocal() at /home/lei/Software/PETSc/petsc-3.20.4/src/mat/interface/matrix.c:2408 > [0]PETSC ERROR: #3 MatSetValuesStencil() at /home/lei/Software/PETSc/petsc-3.20.4/src/mat/interface/matrix.c:1762 > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Local index 423 too large 377 (max) at 0 > [1]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!fTO1ShsqXrxcXKmKrn7uXjX68PlSaKv4RBgRvwP9BUQpeowdAqyQyxq3cSp_3H231u74LG5cJRd24lnAY72xwtI$ for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.20.4, Jan 29, 2024 > [1]PETSC ERROR: Unknown Name on a arch-linux-cxx-debug named TUF-Gaming by lei Tue Mar 19 09:31:01 2024 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-hypre --with-debugging=yes --download-mpich --with-clanguage=cxx > [1]PETSC ERROR: #1 ISLocalToGlobalMappingApply() at /home/lei/Software/PETSc/petsc-3.20.4/src/vec/is/utils/isltog.c:789 > [1]PETSC ERROR: #2 MatSetValuesLocal() at /home/lei/Software/PETSc/petsc-3.20.4/src/mat/interface/matrix.c:2408 > [1]PETSC ERROR: #3 MatSetValuesStencil() at /home/lei/Software/PETSc/petsc-3.20.4/src/mat/interface/matrix.c:1762 > > Is it not possible to set values across processors using MatSetValuesStencil? If I want to set values of the matrix across processors, what should I do? > I am really confused, and I would greatly appreciate your help. > > On Mon, Mar 18, 2024 at 9:28?PM Barry Smith > wrote: >> >> The output is correct (only confusing). For PETSc DMDA by default viewing a parallel matrix converts it to the "natural" ordering instead of the PETSc parallel ordering. >> >> See the Notes in https://urldefense.us/v3/__https://petsc.org/release/manualpages/DM/DMCreateMatrix/__;!!G_uCfscf7eWS!fTO1ShsqXrxcXKmKrn7uXjX68PlSaKv4RBgRvwP9BUQpeowdAqyQyxq3cSp_3H231u74LG5cJRd24lnABMYgziE$ >> >> Barry >> >> >>> On Mar 18, 2024, at 8:06?AM, Waltz Jan > wrote: >>> >>> This Message Is From an External Sender >>> This message came from outside your organization. >>> PETSc version: 3.20.4 >>> Program: >>> #include >>> #include >>> #include >>> #include >>> #include >>> #include >>> #include >>> >>> int main() >>> { >>> PetscInitialize(NULL, NULL, NULL, NULL); >>> DM da; >>> DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DMDA_STENCIL_STAR, >>> 10, 1, 10, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 3, 1, NULL, NULL, NULL, &da); >>> DMSetFromOptions(da); >>> DMSetUp(da); >>> Mat Jac; >>> DMCreateMatrix(da, &Jac); >>> int row = 100, col = 100; >>> double val = 1.; >>> MatSetValues(Jac, 1, &row, 1, &col, &val, INSERT_VALUES); >>> MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY); >>> MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY); >>> >>> PetscViewer viewer; >>> PetscViewerASCIIOpen(PETSC_COMM_WORLD, "./jacobianmatrix.m", &viewer); >>> PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); >>> MatView(Jac, viewer); >>> PetscViewerDestroy(&viewer); >>> >>> PetscFinalize(); >>> } >>> >>> When I ran the program with np = 6, I got the result as the below >>> >>> It's obviously wrong. >>> When I ran the program with np = 1 or 8, I got the right result as >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Mar 19 09:15:52 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 19 Mar 2024 10:15:52 -0400 Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> Are you not able to use PETSc 3.20.2 ? > On Mar 19, 2024, at 5:27?AM, ?? wrote: > > Hi,Barry > > I try to use PETSc version 3.19.5 on windows, but it encounter a problem. 
> > > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error configuring METIS with CMake > ********************************************************************************************* > > configure.log is attached. > > > > Looking forward to your reply! > > sinserely, > > Ben. > > > > > -----????----- > ???: "Barry Smith" > > ????: 2024-03-18 21:11:14 (???) > ???: ?? <202321009113 at mail.scut.edu.cn > > ??: petsc-users at mcs.anl.gov > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. > > Barry > > >> On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn > wrote: >> >> This Message Is From an External Sender >> This message came from outside your organization. >> Hello? >> >> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >> 1. PETSc: version 3.16.5 >> 2. VS: version 2022 >> 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit >> 4. Cygwin >> >> >> On windows, >> Then I try to calculate a simple cantilever beam that use Tetrahedral mesh. So it's unstructured grid >> I use DMPlexCreateFromFile() to creat dmplex. >> And then I want to distributing the mesh for using PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, >> >> see fig 1 for my work to see different PetscPartitioner type about a cantilever beam in Linux system.) >> >> But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows >> >> >> ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' >> >> --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 >> >> --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include >> --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib >> --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec >> --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz >> --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz >> >> >> >> >> it shows that >> ******************************************************************************* >> External package metis does not support --download-metis with Microsoft compilers >> ******************************************************************************* >> configure.log and make.log is attached >> >> >> >> If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. >> So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis, >> >> or I just try to download parmetis separatly on windows(like this website , https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!YjNk7-j6Bla9TDZ2LHFM7dpkmHTGuxJ0-TrkpEtHlCCJ5YGZFVNamiFcoky1BXf2rBhBOfbV3okjYwjHJIDLMs8$ )? >> and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. 
>> >> >> >> So I wrrit this email to report my problem and ask for your help. >> >> Looking forward your reply! >> >> >> sinserely, >> Ben. >> > ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1667139 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajaramillopalma at gmail.com Tue Mar 19 10:07:50 2024 From: ajaramillopalma at gmail.com (Alfredo Jaramillo) Date: Tue, 19 Mar 2024 09:07:50 -0600 Subject: [petsc-users] issue with cmake Message-ID: Dear developers, Please excuse me if this issue is a bit off-topic. I'm trying to compile dolfinx. There is one compilation step where cmake is used with the CMakeLists.txt in this link: https://urldefense.us/v3/__https://github.com/FEniCS/dolfinx/blob/main/cpp/CMakeLists.txt__;!!G_uCfscf7eWS!fO0ZzQfd4dg7-t5nIfqoY8yaP-aPJfvnkuYMQy6S4_BkbdH7bUVHSl331_PvqeNJtWBCKqR1lKQXd5uktKEo0WvNO5JaujY$ where PETSC configuration is got by doing: =================================================== # Check for PETSc find_package(PkgConfig REQUIRED) set(ENV{PKG_CONFIG_PATH} "$ENV{PETSC_DIR}/$ENV{PETSC_ARCH}/lib/pkgconfig:$ENV{PETSC_DIR}/lib/pkgconfig:$ENV{PKG_CONFIG_PATH}" ) pkg_search_module(PETSC REQUIRED IMPORTED_TARGET PETSc>=3.15 petsc>=3.15) # Check if PETSc build uses real or complex scalars (this is configured in # DOLFINxConfig.cmake.in) include(CheckSymbolExists) set(CMAKE_REQUIRED_INCLUDES ${PETSC_INCLUDE_DIRS}) =================================================== However, compilation fails as it is not able to find some headers, I printed the variable PETSC_INCLUDE_DIRS and it is empty. So to get it to compile I included the line: include_directories($ENV{PETSC_DIR}/include $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}/include) My env variables are well set. I'm attaching my configure.log file. Could this be related to PETSC configuration, and its interaction with cmake? Thank you so much. Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 2124693 bytes Desc: not available URL: From balay at mcs.anl.gov Tue Mar 19 11:48:57 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 19 Mar 2024 11:48:57 -0500 (CDT) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> Message-ID: Check https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!ZAg_b85bAvm8-TShDMHvxaXIu77pjwlDqU2g9AXQSNNw0gmk3peDktdf8MsGAq3jHLTJHo6WSPGyEe5QrCJ-fN0$ for a successful build of latest petsc-3.20 [i.e release branch in git] with metis and parmetis Note the usage: >>>>> '--with-cc=cl', '--with-cxx=cl', '--with-fc=ifort', <<<< Satish On Tue, 19 Mar 2024, Barry Smith wrote: > Are you not able to use PETSc 3.?20.?2 ? On Mar 19, 2024, at 5:?27 AM, ?? wrote: Hi,Barry I try to use PETSc version 3.?19.?5 on windows, but it encounter a problem. 
> ********************************************************************************************* > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ? > ZjQcmQRYFpfptBannerEnd > > ? Are you not able to use PETSc 3.20.2 ? > > On Mar 19, 2024, at 5:27?AM, ?? wrote: > > Hi,Barry > > I try to?use PETSc version?3.19.5 on windows, but it encounter a problem. > > > ?********************************************************************************************* > ? ? ? ? ? ?UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Error configuring METIS with CMake > ********************************************************************************************* > > configure.log is attached. > > > Looking forward to your reply! > > sinserely, > > Ben. > > > > -----????----- > ???:?"Barry Smith" > ????:?2024-03-18 21:11:14 (???) > ???:??? <202321009113 at mail.scut.edu.cn> > ??:?petsc-users at mcs.anl.gov > ??:?Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. > ? Barry > > > On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > This Message Is From an External Sender? > This message came from outside your organization. > > Hello? > > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > 1. PETSc: version 3.16.5 > 2. VS: version 2022? > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > 4. Cygwin > > > On windows, > Then I try to calculate a simple cantilever beam? that use Tetrahedral mesh.? So it's? unstructured grid > I use DMPlexCreateFromFile() to creat dmplex. > > And then I want to distributing the mesh for using? PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, > > see fig 1 for my work to see different PetscPartitioner type about a? cantilever beam in Linux system.) > > But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows > > > ?./configure? --with-debugging=0? --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl'?? > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz? --with-shared-libraries=0? > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > ?--with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib? > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec? > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz? > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz? > > > > > it shows that? > ******************************************************************************* > External package metis does not support --download-metis with Microsoft compilers > ******************************************************************************* > configure.log and?make.log is attached > > > > If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. > > So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis,? > > or I just try to download parmetis? 
> separately on Windows (like this website: https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!ZAg_b85bAvm8-TShDMHvxaXIu77pjwlDqU2g9AXQSNNw0gmk3peDktdf8MsGAq3jHLTJHo6WSPGyEe5Qgw3sA7A$ )
> and then use Visual Studio to use its library? I don't know whether PETSc could use it successfully this way.
>
> So I write this email to report my problem and ask for your help.
>
> Looking forward to your reply!
>
> Sincerely,
> Ben.


From mfadams at lbl.gov Tue Mar 19 15:57:53 2024
From: mfadams at lbl.gov (Mark Adams)
Date: Tue, 19 Mar 2024 16:57:53 -0400
Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs
In-Reply-To:
References:
Message-ID:

[keep on list]

I have little experience with running hypre on GPUs, but others might have more.

1M dofs/node is not a lot, and NVIDIA has a larger L1 cache and more mature compilers, etc., so it is not surprising that NVIDIA is faster. I suspect the gap would narrow with a larger problem.

Also, why are you using Kokkos? It should not make a difference, but you could check easily. Just use -vec_type hip with your current code.

You could also test with GAMG, -pc_type gamg

Mark

On Tue, Mar 19, 2024 at 4:12 PM Vanella, Marcos (Fed) <marcos.vanella at nist.gov> wrote:

> Hi Mark, I run a canonical test we have to time our code. It is a propane
> fire on a burner within a box with around 1 million cells.
> I split the problem in 4 GPUs, single node, both in Polaris and Frontier.
> I compiled PETSc with gnu, with HYPRE being downloaded, and the following
> configure options:
>
> - Polaris:
> $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3"
> FCOPTFLAGS="-O3" CUDAOPTFLAGS="-O3" --with-debugging=0
> --download-suitesparse --download-hypre --with-cuda --with-cc=cc
> --with-cxx=CC --with-fc=ftn --with-cudac=nvcc --with-cuda-arch=80
> --download-cmake
>
> - Frontier:
> $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3"
> FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc
> --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc
> --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a}
> ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos
> --download-kokkos-kernels --download-suitesparse --download-hypre
> --download-cmake
>
> Our code was also compiled with gnu compilers and the -O3 flag. I used the
> latest (from this week) PETSc repo update. These are the timings for the test case:
>
> - 8 meshes + 1 million cells case, 8 MPI processes, 4 GPUs, 2 MPI procs
> per GPU, 1 sec run time (~580 time steps, ~1160 Poisson solves):
>
> System     Poisson Solver   GPU Implementation   Poisson Wall time (sec)   Total Wall time (sec)
> Polaris    CG + HYPRE PC    CUDA                 80                        287
> Frontier   CG + HYPRE PC    Kokkos + HIP         158                       401
>
> It is interesting to see that the Poisson solves take twice as long on
> Frontier as on Polaris.
> Do you have experience running HYPRE AMG on these machines? Is this
> difference between the CUDA implementation and Kokkos-kernels to be
> expected?
>
> I can run the case in both computers with the log flags you suggest. That
> might give more information on where the differences are.
> > Thank you for your time, > Marcos > > > ------------------------------ > *From:* Mark Adams > *Sent:* Tuesday, March 5, 2024 2:41 PM > *To:* Vanella, Marcos (Fed) > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Running CG with HYPRE AMG preconditioner in > AMD GPUs > > You can run with -log_view_gpu_time to get rid of the nans and get more > data. > > You can run with -ksp_view to get more info on the solver and send that > output. > > -options_left is also good to use so we can see what parameters you used. > > The last 100 in this row: > > KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 > 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 > 0 0.00e+00 100 > > tells us that all the flops were logged on GPUs. > > You do need at least 100K equations per GPU to see speedup, so don't worry > about small problems. > > Mark > > > > > On Tue, Mar 5, 2024 at 12:52?PM Vanella, Marcos (Fed) via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos > and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" > FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos > and hip options: > > ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" > FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} > ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > > and have started testing our code solving a Poisson linear system with CG > + HYPRE preconditioner. Timings look rather high compared to compilations > done on other machines that have NVIDIA cards. They are also not changing > when using more than one GPU for the simple test I doing. > Does anyone happen to know if HYPRE has an hip GPU implementation for > Boomer AMG and is it compiled when configuring PETSc? > > Thanks! 
> > Marcos > > > PS: This is what I see on the log file (-log_view) when running the case > with 2 GPUs in the node: > > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier > on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by > vanellam Tue Mar 5 12:42:29 2024 > Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: > 2024-03-05 01:04:54 +0000 > > Max Max/Min Avg Total > Time (sec): 8.368e+02 1.000 8.368e+02 > Objects: 0.000e+00 0.000 0.000e+00 > Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 > Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 > MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 > MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 > MPI Reductions: 6.331e+04 2877.545 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 > 100.0% 7.999e+04 100.0% 3.164e+04 50.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total GPU - CpuToGpu - - > GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size > Count Size %F > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 > 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 > 0 0.00e+00 100 > MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 > 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 > 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 > 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 > 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 > 0 0.00e+00 0 > VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 > 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 
0.00e+00 0 > 0.00e+00 0 > SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 > 0.00e+00 0 > SFUnpack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 > 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 > 0 0.00e+00 100 > PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Matrix 7 3 > Vector 7 1 > Index Set 2 2 > Star Forest Graph 1 0 > Krylov Solver 1 0 > Preconditioner 1 0 > > ======================================================================================================================== > Average time to get PetscTime(): 3.01e-08 > Average time for MPI_Barrier(): 3.8054e-06 > Average time for zero size MPI_Send(): 7.101e-06 > #PETSc Option Table entries: > -log_view # (source: command line) > -mat_type mpiaijkokkos # (source: command line) > -vec_type kokkos # (source: command line) > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 > FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi > -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > ----------------------------------------- > Libraries compiled on 2024-03-05 17:04:36 on login08 > Machine characteristics: > Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 > Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc > Using PETSc arch: arch-linux-frontier-opt-gcc > ----------------------------------------- > > Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector > -fvisibility=hidden -O3 > Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none > -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 > ----------------------------------------- > > Using include paths: > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse > -I/opt/rocm-5.4.0/include > ----------------------------------------- > > Using C linker: cc > Using Fortran linker: ftn > Using libraries: > -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -lpetsc > 
-Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib > -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib > -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/ > 22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/ > 22.12.1.1/GNU/9.1/x86_64/lib > -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib > -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib > -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib > -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib > -L/opt/cray/pe/pmi/6.1.8/lib > -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 > -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 > -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 > -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 > -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 > -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib > -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib > -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod > -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd > -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas > -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem > -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 > -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ > -lquadmath -lmpi -lmpi_gtl_hsa > ----------------------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Tue Mar 19 16:06:51 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Tue, 19 Mar 2024 21:06:51 +0000 Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs In-Reply-To: References: Message-ID: Hi Mark, thanks. I'll try your suggestions. So, I would keep -mat_type mpiaijkokkos but -vec_type hip as runtime options? Thanks, Marcos ________________________________ From: Mark Adams Sent: Tuesday, March 19, 2024 4:57 PM To: Vanella, Marcos (Fed) Cc: PETSc users list Subject: Re: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs [keep on list] I have little experience with running hypre on GPUs but others might have more. 1M dogs/node is not a lot and NVIDIA has larger L1 cache and more mature compilers, etc. so it is not surprising that NVIDIA is faster. I suspect the gap would narrow with a larger problem. Also, why are you using Kokkos? It should not make a difference but you could check easily. Just use -vec_type hip with your current code. You could also test with GAMG, -pc_type gamg Mark On Tue, Mar 19, 2024 at 4:12?PM Vanella, Marcos (Fed) > wrote: Hi Mark, I run a canonical test we have to time our code. It is a propane fire on a burner within a box with around 1 million cells. I split the problem in 4 GPUS, single node, both in Polaris and Frontier. 
I compiled PETSc with gnu and HYPRE being downloaded and the following configure options: * Polaris: $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" CUDAOPTFLAGS="-O3" --with-debugging=0 --download-suitesparse --download-hypre --with-cuda --with-cc=cc --with-cxx=CC --with-fc=ftn --with-cudac=nvcc --with-cuda-arch=80 --download-cmake * Frontier: $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake Our code was compiled also with gnu compilers and -O3 flag. I used latest (from this week) PETSc repo update. These are the timings for the test case: * 8 meshes + 1Million cells case, 8 MPI processes, 4 GPUS, 2 MPI Procs per GPU, 1 sec run time (~580 time steps, ~1160 Poisson solves): System Poisson Solver GPU Implementation Poisson Wall time (sec) Total Wall time (sec) Polaris CG + HYPRE PC CUDA 80 287 Frontier CG + HYPRE PC Kokkos + HIP 158 401 It is interesting to see that the Poisson solves take twice the time in Frontier than in Polaris. Do you have experience on running HYPRE AMG on these machines? Is this difference between the CUDA implementation and Kokkos-kernels to be expected? I can run the case in both computers with the log flags you suggest. Might give more information on where the differences are. Thank you for your time, Marcos ________________________________ From: Mark Adams > Sent: Tuesday, March 5, 2024 2:41 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs You can run with -log_view_gpu_time to get rid of the nans and get more data. You can run with -ksp_view to get more info on the solver and send that output. -options_left is also good to use so we can see what parameters you used. The last 100 in this row: KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 tells us that all the flops were logged on GPUs. You do need at least 100K equations per GPU to see speedup, so don't worry about small problems. Mark On Tue, Mar 5, 2024 at 12:52?PM Vanella, Marcos (Fed) via petsc-users > wrote: Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake and have started testing our code solving a Poisson linear system with CG + HYPRE preconditioner. Timings look rather high compared to compilations done on other machines that have NVIDIA cards. 
They are also not changing when using more than one GPU for the simple test I doing. Does anyone happen to know if HYPRE has an hip GPU implementation for Boomer AMG and is it compiled when configuring PETSc? Thanks! Marcos PS: This is what I see on the log file (-log_view) when running the case with 2 GPUs in the node: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by vanellam Tue Mar 5 12:42:29 2024 Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: 2024-03-05 01:04:54 +0000 Max Max/Min Avg Total Time (sec): 8.368e+02 1.000 8.368e+02 Objects: 0.000e+00 0.000 0.000e+00 Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 MPI Reductions: 6.331e+04 2877.545 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 100.0% 7.999e+04 100.0% 3.164e+04 50.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 100 MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 SFUnpack 19494 0.0 nan nan 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 7 3 Vector 7 1 Index Set 2 2 Star Forest Graph 1 0 Krylov Solver 1 0 Preconditioner 1 0 ======================================================================================================================== Average time to get PetscTime(): 3.01e-08 Average time for MPI_Barrier(): 3.8054e-06 Average time for zero size MPI_Send(): 7.101e-06 #PETSc Option Table entries: -log_view # (source: command line) -mat_type mpiaijkokkos # (source: command line) -vec_type kokkos # (source: command line) #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake ----------------------------------------- Libraries compiled on 2024-03-05 17:04:36 on login08 Machine characteristics: Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc Using PETSc arch: arch-linux-frontier-opt-gcc ----------------------------------------- Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 ----------------------------------------- Using include paths: -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse -I/opt/rocm-5.4.0/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -lpetsc -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib 
-Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib -L/opt/cray/pe/pmi/6.1.8/lib -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ -lquadmath -lmpi -lmpi_gtl_hsa ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Mar 19 16:15:56 2024 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 19 Mar 2024 17:15:56 -0400 Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs In-Reply-To: References: Message-ID: You want: -mat_type aijhipsparse On Tue, Mar 19, 2024 at 5:06?PM Vanella, Marcos (Fed) < marcos.vanella at nist.gov> wrote: > Hi Mark, thanks. I'll try your suggestions. So, I would keep -mat_type > mpiaijkokkos but -vec_type hip as runtime options? > Thanks, > Marcos > ------------------------------ > *From:* Mark Adams > *Sent:* Tuesday, March 19, 2024 4:57 PM > *To:* Vanella, Marcos (Fed) > *Cc:* PETSc users list > *Subject:* Re: [petsc-users] Running CG with HYPRE AMG preconditioner in > AMD GPUs > > [keep on list] > > I have little experience with running hypre on GPUs but others might have > more. > > 1M dogs/node is not a lot and NVIDIA has larger L1 cache and more mature > compilers, etc. so it is not surprising that NVIDIA is faster. > I suspect the gap would narrow with a larger problem. > > Also, why are you using Kokkos? It should not make a difference but you > could check easily. Just use -vec_type hip with your current code. > > You could also test with GAMG, -pc_type gamg > > Mark > > > On Tue, Mar 19, 2024 at 4:12?PM Vanella, Marcos (Fed) < > marcos.vanella at nist.gov> wrote: > > Hi Mark, I run a canonical test we have to time our code. It is a propane > fire on a burner within a box with around 1 million cells. > I split the problem in 4 GPUS, single node, both in Polaris and Frontier. 
> I compiled PETSc with gnu and HYPRE being downloaded and the following > configure options: > > > - Polaris: > $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" > FCOPTFLAGS="-O3" CUDAOPTFLAGS="-O3" --with-debugging=0 > --download-suitesparse --download-hypre --with-cuda --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-cudac=nvcc --with-cuda-arch=80 > --download-cmake > > > > - Frontier: > $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" > FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} > ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > > > Our code was compiled also with gnu compilers and -O3 flag. I used latest > (from this week) PETSc repo update. These are the timings for the test case: > > > - 8 meshes + 1Million cells case, 8 MPI processes, 4 GPUS, 2 MPI Procs > per GPU, 1 sec run time (~580 time steps, ~1160 Poisson solves): > > > System Poisson Solver GPU Implementation > Poisson Wall time (sec) Total Wall time (sec) > Polaris CG + HYPRE PC CUDA > 80 287 > Frontier CG + HYPRE PC Kokkos + HIP > 158 401 > > It is interesting to see that the Poisson solves take twice the time in > Frontier than in Polaris. > Do you have experience on running HYPRE AMG on these machines? Is this > difference between the CUDA implementation and Kokkos-kernels to be > expected? > > I can run the case in both computers with the log flags you suggest. Might > give more information on where the differences are. > > Thank you for your time, > Marcos > > > ------------------------------ > *From:* Mark Adams > *Sent:* Tuesday, March 5, 2024 2:41 PM > *To:* Vanella, Marcos (Fed) > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Running CG with HYPRE AMG preconditioner in > AMD GPUs > > You can run with -log_view_gpu_time to get rid of the nans and get more > data. > > You can run with -ksp_view to get more info on the solver and send that > output. > > -options_left is also good to use so we can see what parameters you used. > > The last 100 in this row: > > KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 > 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 > 0 0.00e+00 100 > > tells us that all the flops were logged on GPUs. > > You do need at least 100K equations per GPU to see speedup, so don't worry > about small problems. > > Mark > > > > > On Tue, Mar 5, 2024 at 12:52?PM Vanella, Marcos (Fed) via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos > and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" > FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. 
> > ZjQcmQRYFpfptBannerEnd > Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos > and hip options: > > ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" > FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} > ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > > and have started testing our code solving a Poisson linear system with CG > + HYPRE preconditioner. Timings look rather high compared to compilations > done on other machines that have NVIDIA cards. They are also not changing > when using more than one GPU for the simple test I doing. > Does anyone happen to know if HYPRE has an hip GPU implementation for > Boomer AMG and is it compiled when configuring PETSc? > > Thanks! > > Marcos > > > PS: This is what I see on the log file (-log_view) when running the case > with 2 GPUs in the node: > > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier > on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by > vanellam Tue Mar 5 12:42:29 2024 > Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: > 2024-03-05 01:04:54 +0000 > > Max Max/Min Avg Total > Time (sec): 8.368e+02 1.000 8.368e+02 > Objects: 0.000e+00 0.000 0.000e+00 > Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 > Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 > MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 > MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 > MPI Reductions: 6.331e+04 2877.545 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 > 100.0% 7.999e+04 100.0% 3.164e+04 50.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total GPU - CpuToGpu - - > GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size > Count Size %F > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 > 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 > 0 0.00e+00 100 > MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 > 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 > 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 > 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 100 > VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 > 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 > 0 0.00e+00 0 > VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 > 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 
0.00e+00 0 > 0.00e+00 0 > SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 > 0.00e+00 0 > SFUnpack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 > 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 > 0 0.00e+00 100 > PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 > 0.00e+00 0 > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Matrix 7 3 > Vector 7 1 > Index Set 2 2 > Star Forest Graph 1 0 > Krylov Solver 1 0 > Preconditioner 1 0 > > ======================================================================================================================== > Average time to get PetscTime(): 3.01e-08 > Average time for MPI_Barrier(): 3.8054e-06 > Average time for zero size MPI_Send(): 7.101e-06 > #PETSc Option Table entries: > -log_view # (source: command line) > -mat_type mpiaijkokkos # (source: command line) > -vec_type kokkos # (source: command line) > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 > FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc > --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc > --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi > -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos > --download-kokkos-kernels --download-suitesparse --download-hypre > --download-cmake > ----------------------------------------- > Libraries compiled on 2024-03-05 17:04:36 on login08 > Machine characteristics: > Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 > Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc > Using PETSc arch: arch-linux-frontier-opt-gcc > ----------------------------------------- > > Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector > -fvisibility=hidden -O3 > Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none > -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 > ----------------------------------------- > > Using include paths: > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include > -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse > -I/opt/rocm-5.4.0/include > ----------------------------------------- > > Using C linker: cc > Using Fortran linker: ftn > Using libraries: > -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -lpetsc > 
-Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib > -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib > -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib > -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/ > 22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/ > 22.12.1.1/GNU/9.1/x86_64/lib > -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib > -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib > -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib > -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib > -L/opt/cray/pe/pmi/6.1.8/lib > -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 > -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 > -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 > -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 > -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 > -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib > -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib > -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod > -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd > -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas > -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem > -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 > -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ > -lquadmath -lmpi -lmpi_gtl_hsa > ----------------------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Tue Mar 19 16:21:11 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Tue, 19 Mar 2024 21:21:11 +0000 Subject: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs In-Reply-To: References: Message-ID: Ok, thanks. I'll try it when the machine comes back online. Cheers, M ________________________________ From: Mark Adams Sent: Tuesday, March 19, 2024 5:15 PM To: Vanella, Marcos (Fed) Cc: PETSc users list Subject: Re: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs You want: -mat_type aijhipsparse On Tue, Mar 19, 2024 at 5:06?PM Vanella, Marcos (Fed) > wrote: Hi Mark, thanks. I'll try your suggestions. So, I would keep -mat_type mpiaijkokkos but -vec_type hip as runtime options? Thanks, Marcos ________________________________ From: Mark Adams > Sent: Tuesday, March 19, 2024 4:57 PM To: Vanella, Marcos (Fed) > Cc: PETSc users list > Subject: Re: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs [keep on list] I have little experience with running hypre on GPUs but others might have more. 1M dogs/node is not a lot and NVIDIA has larger L1 cache and more mature compilers, etc. so it is not surprising that NVIDIA is faster. I suspect the gap would narrow with a larger problem. Also, why are you using Kokkos? It should not make a difference but you could check easily. Just use -vec_type hip with your current code. 
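A minimal sketch of such a run line (the launcher, executable name, and rank count are placeholders; I am assuming srun and the 8 ranks from your test, everything else is the runtime options already mentioned in this thread):

  srun -n 8 ./your_fds_executable -vec_type hip -log_view -log_view_gpu_time -ksp_view -options_left

That keeps your CG + HYPRE setup unchanged and only swaps the vector back end, so the comparison isolates the Kokkos layer, and the extra log flags give us the solver and timing output to compare against the Kokkos run.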
You could also test with GAMG, -pc_type gamg Mark On Tue, Mar 19, 2024 at 4:12?PM Vanella, Marcos (Fed) > wrote: Hi Mark, I run a canonical test we have to time our code. It is a propane fire on a burner within a box with around 1 million cells. I split the problem in 4 GPUS, single node, both in Polaris and Frontier. I compiled PETSc with gnu and HYPRE being downloaded and the following configure options: * Polaris: $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" CUDAOPTFLAGS="-O3" --with-debugging=0 --download-suitesparse --download-hypre --with-cuda --with-cc=cc --with-cxx=CC --with-fc=ftn --with-cudac=nvcc --with-cuda-arch=80 --download-cmake * Frontier: $./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake Our code was compiled also with gnu compilers and -O3 flag. I used latest (from this week) PETSc repo update. These are the timings for the test case: * 8 meshes + 1Million cells case, 8 MPI processes, 4 GPUS, 2 MPI Procs per GPU, 1 sec run time (~580 time steps, ~1160 Poisson solves): System Poisson Solver GPU Implementation Poisson Wall time (sec) Total Wall time (sec) Polaris CG + HYPRE PC CUDA 80 287 Frontier CG + HYPRE PC Kokkos + HIP 158 401 It is interesting to see that the Poisson solves take twice the time in Frontier than in Polaris. Do you have experience on running HYPRE AMG on these machines? Is this difference between the CUDA implementation and Kokkos-kernels to be expected? I can run the case in both computers with the log flags you suggest. Might give more information on where the differences are. Thank you for your time, Marcos ________________________________ From: Mark Adams > Sent: Tuesday, March 5, 2024 2:41 PM To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Running CG with HYPRE AMG preconditioner in AMD GPUs You can run with -log_view_gpu_time to get rid of the nans and get more data. You can run with -ksp_view to get more info on the solver and send that output. -options_left is also good to use so we can see what parameters you used. The last 100 in this row: KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 tells us that all the flops were logged on GPUs. You do need at least 100K equations per GPU to see speedup, so don't worry about small problems. Mark On Tue, Mar 5, 2024 at 12:52?PM Vanella, Marcos (Fed) via petsc-users > wrote: Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. 
ZjQcmQRYFpfptBannerEnd Hi all, I compiled the latest PETSc source in Frontier using gcc+kokkos and hip options: ./configure COPTFLAGS="-O3" CXXOPTFLAGS="-O3" FOPTFLAGS="-O3" FCOPTFLAGS="-O3" HIPOPTFLAGS="-O3" --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a}" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake and have started testing our code solving a Poisson linear system with CG + HYPRE preconditioner. Timings look rather high compared to compilations done on other machines that have NVIDIA cards. They are also not changing when using more than one GPU for the simple test I doing. Does anyone happen to know if HYPRE has an hip GPU implementation for Boomer AMG and is it compiled when configuring PETSc? Thanks! Marcos PS: This is what I see on the log file (-log_view) when running the case with 2 GPUs in the node: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ /ccs/home/vanellam/Firemodels_fork/fds/Build/mpich_gnu_frontier/fds_mpich_gnu_frontier on a arch-linux-frontier-opt-gcc named frontier04119 with 4 processors, by vanellam Tue Mar 5 12:42:29 2024 Using Petsc Development GIT revision: v3.20.5-713-gabdf6bc0fcf GIT Date: 2024-03-05 01:04:54 +0000 Max Max/Min Avg Total Time (sec): 8.368e+02 1.000 8.368e+02 Objects: 0.000e+00 0.000 0.000e+00 Flops: 2.546e+11 0.000 1.270e+11 5.079e+11 Flops/sec: 3.043e+08 0.000 1.518e+08 6.070e+08 MPI Msg Count: 1.950e+04 0.000 9.748e+03 3.899e+04 MPI Msg Len (bytes): 1.560e+09 0.000 7.999e+04 3.119e+09 MPI Reductions: 6.331e+04 2877.545 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 8.3676e+02 100.0% 5.0792e+11 100.0% 3.899e+04 100.0% 7.999e+04 100.0% 3.164e+04 50.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 1201 0.0 nan nan 0.00e+00 0.0 2.0e+00 4.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 1200 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatMult 19494 0.0 nan nan 1.35e+11 0.0 3.9e+04 8.0e+04 0.0e+00 7 53 100 100 0 7 53 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 100 MatConvert 3 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 2 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 2 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 3.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecTDot 41382 0.0 nan nan 4.14e+10 0.0 0.0e+00 0.0e+00 2.1e+04 0 16 0 0 33 0 16 0 0 65 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecNorm 20691 0.0 nan nan 2.07e+10 0.0 0.0e+00 0.0e+00 1.0e+04 0 8 0 0 16 0 8 0 0 33 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 2394 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 21888 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 38988 0.0 nan nan 3.90e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 18297 0.0 nan nan 1.83e+10 0.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAssemblyBegin 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 1 0 0 0 0 2 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAssemblyEnd 1197 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecScatterBegin 19494 0.0 nan nan 0.00e+00 0.0 3.9e+04 8.0e+04 0.0e+00 0 0 100 100 0 0 0 100 100 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 VecScatterEnd 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 0.0 nan nan 0.00e+00 0.0 4.0e+00 2.0e+04 5.0e-01 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 19494 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 1.80e-05 0 0.00e+00 0 SFUnpack 19494 0.0 nan nan 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 1197 0.0 2.0291e+02 0.0 2.55e+11 0.0 3.9e+04 8.0e+04 3.1e+04 12 100 100 100 49 12 100 100 100 98 2503 -nan 0 1.80e-05 0 0.00e+00 100 PCSetUp 1 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 PCApply 20691 0.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 7 3 Vector 7 1 Index Set 2 2 Star Forest Graph 1 0 Krylov Solver 1 0 Preconditioner 1 0 ======================================================================================================================== Average time to get PetscTime(): 3.01e-08 Average time for MPI_Barrier(): 3.8054e-06 Average time for zero size MPI_Send(): 7.101e-06 #PETSc Option Table entries: -log_view # (source: command line) -mat_type mpiaijkokkos # (source: command line) -vec_type kokkos # (source: command line) #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 FCOPTFLAGS=-O3 HIPOPTFLAGS=-O3 --with-debugging=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-hip --with-hipc=hipcc --LIBS="-L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -lmpi -L/opt/cray/pe/mpich/8.1.23/gtl/lib -lmpi_gtl_hsa" --download-kokkos --download-kokkos-kernels --download-suitesparse --download-hypre --download-cmake ----------------------------------------- Libraries compiled on 2024-03-05 17:04:36 on login08 Machine characteristics: Linux-5.14.21-150400.24.46_12.0.83-cray_shasta_c-x86_64-with-glibc2.3.4 Using PETSc directory: /autofs/nccs-svm1_home1/vanellam/Software/petsc Using PETSc arch: arch-linux-frontier-opt-gcc ----------------------------------------- Using C compiler: cc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 Using Fortran compiler: ftn -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -O3 ----------------------------------------- Using include paths: -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include -I/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/include/suitesparse -I/opt/rocm-5.4.0/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -lpetsc -Wl,-rpath,/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -L/autofs/nccs-svm1_home1/vanellam/Software/petsc/arch-linux-frontier-opt-gcc/lib -Wl,-rpath,/opt/rocm-5.4.0/lib -L/opt/rocm-5.4.0/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib -L/opt/cray/pe/mpich/8.1.23/ofi/gnu/9.1/lib 
-Wl,-rpath,/opt/cray/pe/mpich/8.1.23/gtl/lib -L/opt/cray/pe/mpich/8.1.23/gtl/lib -Wl,-rpath,/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -L/opt/cray/pe/libsci/22.12.1.1/GNU/9.1/x86_64/lib -Wl,-rpath,/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -L/sw/frontier/spack-envs/base/opt/cray-sles15-zen3/gcc-12.2.0/darshan-runtime-3.4.0-ftq5gccg3qjtyh5xeo2bz4wqkjayjhw3/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.2.2/dsmml/lib -L/opt/cray/pe/dsmml/0.2.2/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.1.8/lib -L/opt/cray/pe/pmi/6.1.8/lib -Wl,-rpath,/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -L/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/lib64 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -L/opt/cray/pe/gcc/12.2.0/snos/lib/gcc/x86_64-suse-linux/12.2.0 -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib64 -L/opt/cray/pe/gcc/12.2.0/snos/lib64 -Wl,-rpath,/opt/rocm-5.4.0/llvm/lib -L/opt/rocm-5.4.0/llvm/lib -Wl,-rpath,/opt/cray/pe/gcc/12.2.0/snos/lib -L/opt/cray/pe/gcc/12.2.0/snos/lib -lHYPRE -lspqr -lumfpack -lklu -lcholmod -lamd -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd -lhipsparse -lhipblas -lhipsolver -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -lmpi -lmpi_gtl_hsa -ldarshan -lz -ldl -lxpmem -lgfortran -lm -lmpifort_gnu_91 -lmpi_gnu_91 -lsci_gnu_82_mpi -lsci_gnu_82 -ldsmml -lpmi -lpmi2 -lgfortran -lquadmath -lpthread -lm -lgcc_s -lstdc++ -lquadmath -lmpi -lmpi_gtl_hsa ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ctchengben at mail.scut.edu.cn Wed Mar 20 05:14:26 2024 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Wed, 20 Mar 2024 18:14:26 +0800 (GMT+08:00) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure-3.20.2.log URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: configure-3.20.5.log URL: From balay at mcs.anl.gov Wed Mar 20 08:29:56 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 20 Mar 2024 08:29:56 -0500 (CDT) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> Message-ID: <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> >>>> Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-debugging=0 --with-cc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-fc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_ifort --with-cxx=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz --with-strict-petscerrorcode=0 <<< >>>>>>>> Warning: win32fe: File Not Found: /Ox Error: win32fe: Input File Not Found: G:\mypetsc\PETSC-~2.2\ARCH-M~1\EXTERN~1\PETSC-~1\PETSC-~1\libmetis\/Ox >>>>>>>>>> Looks like you are using an old snapshot of metis. Can you remove your local tarballs - and let [cygwin] git download the appropriate latest version? Or download and use: https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-metis/get/8b194fdf09661ac41b36fa16db0474d38f46f1ac.tar.gz__;!!G_uCfscf7eWS!dgDT-Y8-6OviQz8-EOWxbNfMKo9bSBj18tYI-sFf-iCfuyKZ10s6q3DBpWS1Ha51iWExhoSpsV69NQ8uEN5mq_A$ Similarly for parmetis https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/f5e3aab04fd5fe6e09fa02f885c1c29d349f9f8b.tar.gz__;!!G_uCfscf7eWS!dgDT-Y8-6OviQz8-EOWxbNfMKo9bSBj18tYI-sFf-iCfuyKZ10s6q3DBpWS1Ha51iWExhoSpsV69NQ8uTmPyVKM$ Satish On Wed, 20 Mar 2024, ?? wrote: > Hi I try petsc-3.?20.?2 and petsc-3.?20.?5 with configure ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.?2.?0/lib/intel64 mkl-intel-lp64-dll.?lib > mkl-sequential-dll.?lib > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ? 
> ZjQcmQRYFpfptBannerEnd > > Hi > I try petsc-3.20.2 and petsc-3.20.5 with configure > > ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz > --with-strict-petscerrorcode=0 > > but it encounter same question, > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error running make on METIS > ********************************************************************************************* > > and I see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$ for a successful build of lates > t petsc-3.20 , it seem have something called "sowing" and "bison" , but I don't have. > > So I ask for your help, and configure.log is attached. > > sinserely, > Ben. > > > -----????----- > > ???: "Satish Balay" > > ????:2024-03-20 00:48:57 (???) > > ???: "Barry Smith" > > ??: ?? , PETSc > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > Check https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$ for a successful build of latest > petsc-3.20 [i.e release branch in git] with metis and parmetis > > > > Note the usage: > > > > >>>>> > > '--with-cc=cl', > > '--with-cxx=cl', > > '--with-fc=ifort', > > <<<< > > > > Satish > > > > On Tue, 19 Mar 2024, Barry Smith wrote: > > > > > Are you not able to use PETSc 3.?20.?2 ? On Mar 19, 2024, at 5:?27 AM, ?? wrote: Hi,Barry I try to use PETSc version 3.?19.?5 on windows, but it encounter a problem. > > > ********************************************************************************************* > > > ZjQcmQRYFpfptBannerStart > > > This Message Is From an External Sender > > > This message came from outside your organization. > > > ? > > > ZjQcmQRYFpfptBannerEnd > > > > > > ? Are you not able to use PETSc 3.20.2 ? > > > > > > On Mar 19, 2024, at 5:27?AM, ?? wrote: > > > > > > Hi,Barry > > > > > > I try to?use PETSc version?3.19.5 on windows, but it encounter a problem. > > > > > > > > > ?********************************************************************************************* > > > ? ? ? ? ? ?UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > --------------------------------------------------------------------------------------------- > > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Error configuring METIS with CMake > > > ********************************************************************************************* > > > > > > configure.log is attached. > > > > > > > > > Looking forward to your reply! > > > > > > sinserely, > > > > > > Ben. 
> > > > > > > > > > > > -----????----- > > > ???:?"Barry Smith" > > > ????:?2024-03-18 21:11:14 (???) > > > ???:??? <202321009113 at mail.scut.edu.cn> > > > ??:?petsc-users at mcs.anl.gov > > > ??:?Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. > > > ? Barry > > > > > > > > > On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > > > > > This Message Is From an External Sender? > > > This message came from outside your organization. > > > > > > Hello? > > > > > > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > > > 1. PETSc: version 3.16.5 > > > 2. VS: version 2022? > > > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > > 4. Cygwin > > > > > > > > > On windows, > > > Then I try to calculate a simple cantilever beam? that use Tetrahedral mesh.? So it's? unstructured grid > > > I use DMPlexCreateFromFile() to creat dmplex. > > > > > > And then I want to distributing the mesh for using? PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, > > > > > > see fig 1 for my work to see different PetscPartitioner type about a? cantilever beam in Linux system.) > > > > > > But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows > > > > > > > > > ?./configure? --with-debugging=0? --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl'?? > > > > > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz? --with-shared-libraries=0? > > > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > > > ?--with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib? > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec? > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz? > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz? > > > > > > > > > > > > > > > it shows that? > > > ******************************************************************************* > > > External package metis does not support --download-metis with Microsoft compilers > > > ******************************************************************************* > > > configure.log and?make.log is attached > > > > > > > > > > > > If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. > > > > > > So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis,? > > > > > > or I just try to download parmetis? separatly on windows(like this website ,?https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT > 1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t2815v60Yvg$)? > > > > > > and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. > > > > > > > > > So I wrrit this email to report my problem and ask for your help. > > > > > > Looking forward your reply! > > > > > > > > > sinserely, > > > Ben. 
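In practice, the suggestion above comes down to pointing the two --download options at the newer snapshots instead of the old local tarballs (URLs unwrapped here from the urldefense links in this thread); a sketch of the resulting configure line, keeping the remaining options from the report and quoting the multi-word values for the shell:

    ./configure --with-debugging=0 --with-cc=cl --with-cxx=cl --with-fc=ifort \
      --with-blaslapack-lib='-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib' \
      --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include \
      --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib \
      --with-mpiexec='/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly' \
      --download-metis=https://bitbucket.org/petsc/pkg-metis/get/8b194fdf09661ac41b36fa16db0474d38f46f1ac.tar.gz \
      --download-parmetis=https://bitbucket.org/petsc/pkg-parmetis/get/f5e3aab04fd5fe6e09fa02f885c1c29d349f9f8b.tar.gz \
      --with-strict-petscerrorcode=0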
> > > > > > > > > > > > > > > > > > > > From ctchengben at mail.scut.edu.cn Thu Mar 21 09:40:56 2024 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Thu, 21 Mar 2024 22:40:56 +0800 (GMT+08:00) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> Message-ID: <8f1e656.7d8.18e617539b9.Coremail.ctchengben@mail.scut.edu.cn> An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure.log URL: From balay at mcs.anl.gov Thu Mar 21 09:47:40 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 21 Mar 2024 09:47:40 -0500 (CDT) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <8f1e656.7d8.18e617539b9.Coremail.ctchengben@mail.scut.edu.cn> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> <8f1e656.7d8.18e617539b9.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <0cd51025-6f7d-587c-b155-658b5fd790b9@mcs.anl.gov> Delete your old build files - and retry. i.e rm -rf /cygdrive/g/mypetsc/petsc-3.20.5/arch-mswin-c-opt ./configure .... Satish On Thu, 21 Mar 2024, ?? wrote: > Hi, Satish Thanks for your reply, I try both way your said in petsc-3.?20.?5 but it encounter same question, ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > ? > ZjQcmQRYFpfptBannerEnd > > Hi, Satish > Thanks for your reply, I try both way your said in petsc-3.20.5 > but it encounter same question, > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error running make on METIS > ********************************************************************************************* > > I send configure.log with "--download-parmetis --download-metis" to you and ask for you help. > > sinserely, > Ben. > > > -----????----- > > ???: "Satish Balay" > > ????:2024-03-20 21:29:56 (???) > > ???: ?? 
> > ??: petsc-users > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > >>>> > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-debugging=0 --with-cc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-fc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_ifort --with-cxx=/cygdrive/g/mypetsc/p > etsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/20 > 21.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz --with-strict-petscerrorcode=0 > > <<< > > > > >>>>>>>> > > Warning: win32fe: File Not Found: /Ox > > Error: win32fe: Input File Not Found: G:\mypetsc\PETSC-~2.2\ARCH-M~1\EXTERN~1\PETSC-~1\PETSC-~1\libmetis\/Ox > > >>>>>>>>>> > > > > Looks like you are using an old snapshot of metis. Can you remove your local tarballs - and let [cygwin] git download the appropriate latest version? > > > > Or download and use: https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-metis/get/8b194fdf09661ac41b36fa16db0474d38f46f1ac.tar.gz__;!!G_uCfscf7eWS!cI7AqtwOwG0MWFBbcOA813z8p7Q2IZcdv53HvzHMxK37qmlicGatMh0ya5WWcEVfiYZ5JDmS7vfgveYi05xU_O8moU6wWwAikw$ > > Similarly for parmetis https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/f5e3aab04fd5fe6e09fa02f885c1c29d349f9f8b.tar.gz__;!!G_uCfscf7eWS!cI7AqtwOwG0MWFBbcOA813z8p7Q2IZcdv53HvzHMxK37qmlicGatMh0ya5WWcEVfiYZ5JDmS7vfgveYi05xU_O8moU4L6tLXtg$ > > > > Satish > > > > On Wed, 20 Mar 2024, ?? wrote: > > > > > Hi I try petsc-3.?20.?2 and petsc-3.?20.?5 with configure ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.?2.?0/lib/intel64 mkl-intel-lp64-dll.?lib > > > mkl-sequential-dll.?lib > > > ZjQcmQRYFpfptBannerStart > > > This Message Is From an External Sender > > > This message came from outside your organization. > > > ? 
> > > ZjQcmQRYFpfptBannerEnd > > > > > > Hi > > > I try petsc-3.20.2 and petsc-3.20.5 with configure > > > > > > ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl > > > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz > > > --with-strict-petscerrorcode=0 > > > > > > but it encounter same question, > > > > > > ********************************************************************************************* > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > --------------------------------------------------------------------------------------------- > > > Error running make on METIS > > > ********************************************************************************************* > > > > > > and I see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$for a successful build of lates > > > t petsc-3.20 , it seem have something called "sowing" and "bison" , but I don't have. > > > > > > So I ask for your help, and configure.log is attached. > > > > > > sinserely, > > > Ben. > > > > > > > -----????----- > > > > ???: "Satish Balay" > > > > ????:2024-03-20 00:48:57 (???) > > > > ???: "Barry Smith" > > > > ??: ?? , PETSc > > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > Check https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$for a successful build of latest > > > petsc-3.20 [i.e release branch in git] with metis and parmetis > > > > > > > > Note the usage: > > > > > > > > >>>>> > > > > '--with-cc=cl', > > > > '--with-cxx=cl', > > > > '--with-fc=ifort', > > > > <<<< > > > > > > > > Satish > > > > > > > > On Tue, 19 Mar 2024, Barry Smith wrote: > > > > > > > > > Are you not able to use PETSc 3.?20.?2 ? On Mar 19, 2024, at 5:?27 AM, ?? wrote: Hi,Barry I try to use PETSc version 3.?19.?5 on windows, but it encounter a problem. > > > > > ********************************************************************************************* > > > > > ZjQcmQRYFpfptBannerStart > > > > > This Message Is From an External Sender > > > > > This message came from outside your organization. > > > > > ? > > > > > ZjQcmQRYFpfptBannerEnd > > > > > > > > > > ? Are you not able to use PETSc 3.20.2 ? > > > > > > > > > > On Mar 19, 2024, at 5:27?AM, ?? wrote: > > > > > > > > > > Hi,Barry > > > > > > > > > > I try to?use PETSc version?3.19.5 on windows, but it encounter a problem. > > > > > > > > > > > > > > > ?********************************************************************************************* > > > > > ? ? ? ? ? ?UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > > > --------------------------------------------------------------------------------------------- > > > > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
Error configuring METIS with CMake > > > > > ********************************************************************************************* > > > > > > > > > > configure.log is attached. > > > > > > > > > > > > > > > Looking forward to your reply! > > > > > > > > > > sinserely, > > > > > > > > > > Ben. > > > > > > > > > > > > > > > > > > > > -----????----- > > > > > ???:?"Barry Smith" > > > > > ????:?2024-03-18 21:11:14 (???) > > > > > ???:??? <202321009113 at mail.scut.edu.cn> > > > > > ??:?petsc-users at mcs.anl.gov > > > > > ??:?Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > > > > > > > Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. > > > > > ? Barry > > > > > > > > > > > > > > > On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > > > > > > > > > This Message Is From an External Sender? > > > > > This message came from outside your organization. > > > > > > > > > > Hello? > > > > > > > > > > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > > > > > 1. PETSc: version 3.16.5 > > > > > 2. VS: version 2022? > > > > > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > > > > 4. Cygwin > > > > > > > > > > > > > > > On windows, > > > > > Then I try to calculate a simple cantilever beam? that use Tetrahedral mesh.? So it's? unstructured grid > > > > > I use DMPlexCreateFromFile() to creat dmplex. > > > > > > > > > > And then I want to distributing the mesh for using? PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, > > > > > > > > > > see fig 1 for my work to see different PetscPartitioner type about a? cantilever beam in Linux system.) > > > > > > > > > > But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows > > > > > > > > > > > > > > > ?./configure? --with-debugging=0? --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl'?? > > > > > > > > > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz? --with-shared-libraries=0? > > > > > > > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > > > > > ?--with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib? > > > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec? > > > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz? > > > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz? > > > > > > > > > > > > > > > > > > > > > > > > > it shows that? > > > > > ******************************************************************************* > > > > > External package metis does not support --download-metis with Microsoft compilers > > > > > ******************************************************************************* > > > > > configure.log and?make.log is attached > > > > > > > > > > > > > > > > > > > > If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. > > > > > > > > > > So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis,? > > > > > > > > > > or I just try to download parmetis? 
separatly on windows(like this website ,?https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT > >> 1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t2815v60Yvg$)? > > > > > > > > > > and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. > > > > > > > > > > > > > > > So I wrrit this email to report my problem and ask for your help. > > > > > > > > > > Looking forward your reply! > > > > > > > > > > > > > > > sinserely, > > > > > Ben. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From balay at mcs.anl.gov Thu Mar 21 09:49:28 2024 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 21 Mar 2024 09:49:28 -0500 (CDT) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <0cd51025-6f7d-587c-b155-658b5fd790b9@mcs.anl.gov> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> <8f1e656.7d8.18e617539b9.Coremail.ctchengben@mail.scut.edu.cn> <0cd51025-6f7d-587c-b155-658b5fd790b9@mcs.anl.gov> Message-ID: <3ec05215-510c-adb9-bedb-aa39907166fa@mcs.anl.gov> Checking for program /usr/bin/git...not found Checking for program /cygdrive/c/Users/Akun/AppData/Local/Programs/Git/bin/git...found Also you appear to not use 'cygwin git' I'm not sure if this alternative git would work or break - so either install/use cygwin git - or use tarballs. Satish On Thu, 21 Mar 2024, Satish Balay via petsc-users wrote: > Delete your old build files - and retry. i.e > > rm -rf /cygdrive/g/mypetsc/petsc-3.20.5/arch-mswin-c-opt > > ./configure .... > > Satish > > > On Thu, 21 Mar 2024, ?? wrote: > > > Hi, Satish Thanks for your reply, I try both way your said in petsc-3.?20.?5 but it encounter same question, ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS > > ZjQcmQRYFpfptBannerStart > > This Message Is From an External Sender > > This message came from outside your organization. > > ? > > ZjQcmQRYFpfptBannerEnd > > > > Hi, Satish > > Thanks for your reply, I try both way your said in petsc-3.20.5 > > but it encounter same question, > > > > ********************************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > --------------------------------------------------------------------------------------------- > > Error running make on METIS > > ********************************************************************************************* > > > > I send configure.log with "--download-parmetis --download-metis" to you and ask for you help. > > > > sinserely, > > Ben. > > > > > -----????----- > > > ???: "Satish Balay" > > > ????:2024-03-20 21:29:56 (???) > > > ???: ?? 
> > > ??: petsc-users > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > >>>> > > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-debugging=0 --with-cc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-fc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_ifort --with-cxx=/cygdrive/g/mypetsc/p > > etsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/20 > > 21.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz --with-strict-petscerrorcode=0 > > > <<< > > > > > > >>>>>>>> > > > Warning: win32fe: File Not Found: /Ox > > > Error: win32fe: Input File Not Found: G:\mypetsc\PETSC-~2.2\ARCH-M~1\EXTERN~1\PETSC-~1\PETSC-~1\libmetis\/Ox > > > >>>>>>>>>> > > > > > > Looks like you are using an old snapshot of metis. Can you remove your local tarballs - and let [cygwin] git download the appropriate latest version? > > > > > > Or download and use: https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-metis/get/8b194fdf09661ac41b36fa16db0474d38f46f1ac.tar.gz__;!!G_uCfscf7eWS!cI7AqtwOwG0MWFBbcOA813z8p7Q2IZcdv53HvzHMxK37qmlicGatMh0ya5WWcEVfiYZ5JDmS7vfgveYi05xU_O8moU6wWwAikw$ > > > Similarly for parmetis https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/f5e3aab04fd5fe6e09fa02f885c1c29d349f9f8b.tar.gz__;!!G_uCfscf7eWS!cI7AqtwOwG0MWFBbcOA813z8p7Q2IZcdv53HvzHMxK37qmlicGatMh0ya5WWcEVfiYZ5JDmS7vfgveYi05xU_O8moU4L6tLXtg$ > > > > > > Satish > > > > > > On Wed, 20 Mar 2024, ?? wrote: > > > > > > > Hi I try petsc-3.?20.?2 and petsc-3.?20.?5 with configure ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.?2.?0/lib/intel64 mkl-intel-lp64-dll.?lib > > > > mkl-sequential-dll.?lib > > > > ZjQcmQRYFpfptBannerStart > > > > This Message Is From an External Sender > > > > This message came from outside your organization. > > > > ? 
> > > > ZjQcmQRYFpfptBannerEnd > > > > > > > > Hi > > > > I try petsc-3.20.2 and petsc-3.20.5 with configure > > > > > > > > ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl > > > > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly > > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz > > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz > > > > --with-strict-petscerrorcode=0 > > > > > > > > but it encounter same question, > > > > > > > > ********************************************************************************************* > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > > --------------------------------------------------------------------------------------------- > > > > Error running make on METIS > > > > ********************************************************************************************* > > > > > > > > and I see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$for a successful build of lates > > > > t petsc-3.20 , it seem have something called "sowing" and "bison" , but I don't have. > > > > > > > > So I ask for your help, and configure.log is attached. > > > > > > > > sinserely, > > > > Ben. > > > > > > > > > -----????----- > > > > > ???: "Satish Balay" > > > > > ????:2024-03-20 00:48:57 (???) > > > > > ???: "Barry Smith" > > > > > ??: ?? , PETSc > > > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > > Check https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$for a successful build of latest > > > > petsc-3.20 [i.e release branch in git] with metis and parmetis > > > > > > > > > > Note the usage: > > > > > > > > > > >>>>> > > > > > '--with-cc=cl', > > > > > '--with-cxx=cl', > > > > > '--with-fc=ifort', > > > > > <<<< > > > > > > > > > > Satish > > > > > > > > > > On Tue, 19 Mar 2024, Barry Smith wrote: > > > > > > > > > > > Are you not able to use PETSc 3.?20.?2 ? On Mar 19, 2024, at 5:?27 AM, ?? wrote: Hi,Barry I try to use PETSc version 3.?19.?5 on windows, but it encounter a problem. > > > > > > ********************************************************************************************* > > > > > > ZjQcmQRYFpfptBannerStart > > > > > > This Message Is From an External Sender > > > > > > This message came from outside your organization. > > > > > > ? > > > > > > ZjQcmQRYFpfptBannerEnd > > > > > > > > > > > > ? Are you not able to use PETSc 3.20.2 ? > > > > > > > > > > > > On Mar 19, 2024, at 5:27?AM, ?? wrote: > > > > > > > > > > > > Hi,Barry > > > > > > > > > > > > I try to?use PETSc version?3.19.5 on windows, but it encounter a problem. > > > > > > > > > > > > > > > > > > ?********************************************************************************************* > > > > > > ? ? ? ? ? 
?UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > > > > --------------------------------------------------------------------------------------------- > > > > > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Error configuring METIS with CMake > > > > > > ********************************************************************************************* > > > > > > > > > > > > configure.log is attached. > > > > > > > > > > > > > > > > > > Looking forward to your reply! > > > > > > > > > > > > sinserely, > > > > > > > > > > > > Ben. > > > > > > > > > > > > > > > > > > > > > > > > -----????----- > > > > > > ???:?"Barry Smith" > > > > > > ????:?2024-03-18 21:11:14 (???) > > > > > > ???:??? <202321009113 at mail.scut.edu.cn> > > > > > > ??:?petsc-users at mcs.anl.gov > > > > > > ??:?Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > > > > > > > > > > Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. > > > > > > ? Barry > > > > > > > > > > > > > > > > > > On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > > > > > > > > > > > This Message Is From an External Sender? > > > > > > This message came from outside your organization. > > > > > > > > > > > > Hello? > > > > > > > > > > > > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > > > > > > 1. PETSc: version 3.16.5 > > > > > > 2. VS: version 2022? > > > > > > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > > > > > 4. Cygwin > > > > > > > > > > > > > > > > > > On windows, > > > > > > Then I try to calculate a simple cantilever beam? that use Tetrahedral mesh.? So it's? unstructured grid > > > > > > I use DMPlexCreateFromFile() to creat dmplex. > > > > > > > > > > > > And then I want to distributing the mesh for using? PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, > > > > > > > > > > > > see fig 1 for my work to see different PetscPartitioner type about a? cantilever beam in Linux system.) > > > > > > > > > > > > But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows > > > > > > > > > > > > > > > > > > ?./configure? --with-debugging=0? --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl'?? > > > > > > > > > > > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz? --with-shared-libraries=0? > > > > > > > > > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > > > > > > ?--with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib? > > > > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec? > > > > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz? > > > > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > it shows that? > > > > > > ******************************************************************************* > > > > > > External package metis does not support --download-metis with Microsoft compilers > > > > > > ******************************************************************************* > > > > > > configure.log and?make.log is attached > > > > > > > > > > > > > > > > > > > > > > > > If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. 
> > > > > > > > > > > > So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis,? > > > > > > > > > > > > or I just try to download parmetis? separatly on windows(like this website ,?https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT > > >> 1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t2815v60Yvg$)? > > > > > > > > > > > > and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. > > > > > > > > > > > > > > > > > > So I wrrit this email to report my problem and ask for your help. > > > > > > > > > > > > Looking forward your reply! > > > > > > > > > > > > > > > > > > sinserely, > > > > > > Ben. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From martin.diehl at kuleuven.be Thu Mar 21 11:21:44 2024 From: martin.diehl at kuleuven.be (Martin Diehl) Date: Thu, 21 Mar 2024 16:21:44 +0000 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? Message-ID: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> Dear PETSc team, I've worked on Fortran interfaces (see https://gitlab.com/petsc/petsc/-/issues/1540) but could not get far in the time I could afford. In discussion with Javier (in CC) the idea came up to propose to offer the work on Fortran interfaces for PETSc as a Google Summer of Code project. fortran-lang has been accepted as organization and the current projects are on: https://github.com/fortran-lang/webpage/wiki/GSoC-2024-Project-ideas The main work would be the automatization of interfaces that are currently manually created via Python. This includes an improved user experience, because correct variable names (not a, b, c) can be used. It should be also possible to automatically create descriptions of the enumerators. As outlook tasks, I would propose: - check whether a unified automatization script can also replace the current tool for creation of interfaces. - investigate improved handling of strings (there are ways in newer standards). I can offer to do the supervision, but would certainly need guidance and the ok from the PETSc core team. best regards, Martin -- KU Leuven Department of Computer Science Department of Materials Engineering Celestijnenlaan 200a 3001 Leuven, Belgium -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From bsmith at petsc.dev Thu Mar 21 12:19:02 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Mar 2024 13:19:02 -0400 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> Message-ID: An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Mar 21 16:19:31 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 21 Mar 2024 15:19:31 -0600 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> Message-ID: <87frwjffks.fsf@jedbrown.org> An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Thu Mar 21 16:29:29 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Mar 2024 17:29:29 -0400 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <87frwjffks.fsf@jedbrown.org> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> Message-ID: <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Mar 21 17:35:12 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 21 Mar 2024 16:35:12 -0600 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> Message-ID: <877chvfc2n.fsf@jedbrown.org> An HTML attachment was scrubbed... URL: From martin.diehl at kuleuven.be Thu Mar 21 17:49:35 2024 From: martin.diehl at kuleuven.be (Martin Diehl) Date: Thu, 21 Mar 2024 22:49:35 +0000 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> Message-ID: Dear Barry, all, pls find my comments below. On Thu, 2024-03-21 at 13:19 -0400, Barry Smith wrote: > > ?? Martin, > > ??? Thanks for the suggestions and offer. > > ??? The tool we use for automatically generating the Fortran stubs > and interfaces is bfort. > > ???? Its limitations include that it cannot handle string arguments > automatically and cannot generate more than one interface for a > function. This is why we need to provide these manually (the use of > a,b,... is to prevent long lines and the need for continuations in > the definitions of the interfaces). This should be fixable: Either tell the compiler to accept long lines or introduce line breaks if needed. > > ???? Adding support for strings is very straightforward, just a > little more smarts in bfort. perfect. If I remember correctly, a lot of interfaces that I contributed where needed because of strings. > > ???? Adding support for multiple interface generation is a bit > trickier because the code must (based on the C calling sequence) > automatically determine all the combinations of array vs single value > the interfaces should generate and then generate a Fortran stub for > each (all mapping back to the same master stub for that function). > I've talked to Bill Gropp about having him add such support, but he > simply does not have time for such work so most recent work on the > bfort that PETSc uses has been by us. I don't have the time either, hence the idea to search for someone paid by google. > > ???? We've always had some tension between adding new features to > bfort vs developing an entirely new tool (for example in Python > (maybe calling a little LLVM to help parse the C function), for maybe > there is already a tool out there) to replace bfort.
Both approaches > have their advantages and disadvantages instead we've relied on the > quick and dirty of providing the interfaces as needed). We have not > needed the Fortran standard C interface stuff and I would prefer not > to use it unless it offers some huge advantage). both approaches (improving bfort or writing something new) are fine with me. Regarding the Fortran standard interfacing: Are your main concern the use of ISO_Fortran_binding.h on the C side? (see https://fortran-lang.discourse.group/t/examples-iso-c-binding-calling-fortran-from-c/4149/3 ). Other language features (like the '*' to indicate 'any type') on the Fortran side are already used as far as I know. > > ??? Thoughts? > > ?? Barry > > > > ???? > > > > > On Mar 21, 2024, at 12:21?PM, Martin Diehl > > wrote: > > > > Dear PETSc team, > > > > I've worked on Fortran interfaces (see > > https://gitlab.com/petsc/petsc/-/issues/1540) but could not get far > > in > > the time I could afford. > > > > In discussion with Javier (in CC) the idea came up to propose to > > offer > > the work on Fortran interfaces for PETSc as a Google Summer of Code > > project. > > > > fortran-lang has been accepted as organization and the current > > projects > > are on: > > https://github.com/fortran-lang/webpage/wiki/GSoC-2024-Project-ideas > > > > The main work would be the automatization of interfaces that are > > currently manually created via Python. This includes an improved > > user > > experience, because correct variable names (not a, b, c) can be > > used. > > It should be also possible to automatically create descriptions of > > the > > enumerators. > > > > As outlook tasks, I would propose: > > - check whether a unified automatization script can also replace > > the > > current tool for creation of interfaces. > > - investigate improved handling of strings (there are ways in newer > > standards). > > > > I can offer to do the supervision, but would certainly need > > guidance > > and the ok from the PETSc core team. > > > > best regards, > > Martin > > > > -- > > KU Leuven > > Department of Computer Science > > Department of Materials Engineering > > Celestijnenlaan 200a > > 3001 Leuven, Belgium > -- KU Leuven Department of Computer Science Department of Materials Engineering Celestijnenlaan 200a 3001 Leuven, Belgium -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From martin.diehl at kuleuven.be Thu Mar 21 18:06:39 2024 From: martin.diehl at kuleuven.be (Martin Diehl) Date: Thu, 21 Mar 2024 23:06:39 +0000 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <877chvfc2n.fsf@jedbrown.org> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> <877chvfc2n.fsf@jedbrown.org> Message-ID: On Thu, 2024-03-21 at 16:35 -0600, Jed Brown wrote: > Barry Smith writes: > > > In my limited understanding of the Fortran iso_c_binding, if we do > > not provide an equivalent Fortran stub (the user calls) that uses > > the iso_c_binding to call PETSc C code, then when the user calls > > PETSc C code directly via the iso_c_binding they have to pass > > iso_c_binding type arguments to the call. This I consider > > unacceptable. 
So my conclusion was there is the same number of > > stubs, just in a different language, so there is no reason to > > consider changing since we cannot "delete lots of stubs", but I > > could be wrong. > > I don't want users to deal with iso_c_binding manually. > > We already have the generated ftn-auto-interfaces/*.h90. The > INTERFACE keyword could be replaced with CONTAINS (making these > definitions instead of just interfaces), and then the bodies could > use iso_c_binding to call the C functions. That would reduce > repetition and be the standards-compliant way to do this. What we do > now with detecting the Fortran mangling scheme and calling > conventions "works" but doesn't conform to any standard and there's > nothing stopping Fortran implementations from creating yet another > variant that we have to deal with manually. I think for "easy" functions, interfaces are sufficient. The only change I propose with my current knowledge would be the use of "bind(C)" on the Fortran side to get rid of the preprocessor statements for name mangling. For complicated functions, e.g. those having a string argument, I currently use an interface that tells Fortran how the C function looks like and a function definition that translates the Fortran string to a C string. > > I don't know if this change would enable inlining without LTO, though > I think the indirection through our C sourcef.c is rarely a > performance factor for Fortran users. From the discussion, I conclude that there is general interest in the topic and I would suggest that I go ahead and propose the topic. The first task would then the comparison of different approaches for the automated generation of interfaces -- KU Leuven Department of Computer Science Department of Materials Engineering Celestijnenlaan 200a 3001 Leuven, Belgium -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From bsmith at petsc.dev Thu Mar 21 22:55:22 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Mar 2024 23:55:22 -0400 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <877chvfc2n.fsf@jedbrown.org> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> <877chvfc2n.fsf@jedbrown.org> Message-ID: <4D90CF28-E7F9-4F3B-B081-C8189899C34E@petsc.dev> > On Mar 21, 2024, at 6:35?PM, Jed Brown wrote: > > Barry Smith writes: > >> In my limited understanding of the Fortran iso_c_binding, if we do not provide an equivalent Fortran stub (the user calls) that uses the iso_c_binding to call PETSc C code, then when the user calls PETSc C code directly via the iso_c_binding they have to pass iso_c_binding type arguments to the call. This I consider unacceptable. So my conclusion was there is the same number of stubs, just in a different language, so there is no reason to consider changing since we cannot "delete lots of stubs", but I could be wrong. > > I don't want users to deal with iso_c_binding manually. > > We already have the generated ftn-auto-interfaces/*.h90. The INTERFACE keyword could be replaced with CONTAINS (making these definitions instead of just interfaces), and then the bodies could use iso_c_binding to call the C functions. That would reduce repetition and be the standards-compliant way to do this. 
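As a concrete picture of the pattern under discussion (a bind(C) interface kept inside a module, plus a contained procedure that converts the Fortran string and is what the user actually calls), a minimal sketch follows. The names example_set_prefix_c / ExampleSetPrefix and the signature are invented for illustration and are not an existing PETSc stub:

    module example_wrappers
      use, intrinsic :: iso_c_binding, only: c_char, c_int, c_null_char
      implicit none
      private
      public :: ExampleSetPrefix

      interface
        ! hypothetical C-side stub; a generator would emit one of these per C routine
        function example_set_prefix_c(prefix) bind(C, name="example_set_prefix_c") result(ierr)
          import :: c_char, c_int
          character(kind=c_char), intent(in) :: prefix(*)
          integer(c_int) :: ierr
        end function example_set_prefix_c
      end interface

    contains

      ! what the user calls: a plain Fortran string in, an integer error code out
      subroutine ExampleSetPrefix(prefix, ierr)
        character(len=*), intent(in)  :: prefix
        integer,          intent(out) :: ierr
        ierr = int(example_set_prefix_c(trim(prefix)//c_null_char))
      end subroutine ExampleSetPrefix

    end module example_wrappers

The caller never touches c_char or c_null_char, the real argument name survives into the interface, and no compiler-specific name mangling is involved because the C symbol name is fixed by bind(C).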
Sure, the interface and the stub go in the same file instead of two files. This is slightly nicer but not significantly simpler, and alone, it is not reason enough to write an entire new stub generator. > What we do now with detecting the Fortran mangling scheme and calling conventions "works" but doesn't conform to any standard and there's nothing stopping Fortran implementations from creating yet another variant that we have to deal with manually. From practical experience, calling C/Fortran using non-standards has only gotten easier over the last thirty-five years (fewer variants on how char* is handled); it has not gotten more complicated, so I submit the likelihood of "nothing stopping Fortran implementations from creating yet another variant that we have to deal with manually" is (though possible) rather unlikely. As far as I am concerned, much of iso_c_binding stuff just solved a problem that never really existed (except in some people's minds) since calling C/Fortran has always been easy, modulo knowing a tiny bit of information.. > I don't know if this change would enable inlining without LTO, though I think the indirection through our C sourcef.c is rarely a performance factor for Fortran users. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Mar 22 00:13:40 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 21 Mar 2024 23:13:40 -0600 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <4D90CF28-E7F9-4F3B-B081-C8189899C34E@petsc.dev> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> <877chvfc2n.fsf@jedbrown.org> <4D90CF28-E7F9-4F3B-B081-C8189899C34E@petsc.dev> Message-ID: <87edc2c0hn.fsf@jedbrown.org> An HTML attachment was scrubbed... URL: From martin.diehl at kuleuven.be Fri Mar 22 06:03:00 2024 From: martin.diehl at kuleuven.be (Martin Diehl) Date: Fri, 22 Mar 2024 11:03:00 +0000 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <87edc2c0hn.fsf@jedbrown.org> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> <877chvfc2n.fsf@jedbrown.org> <4D90CF28-E7F9-4F3B-B081-C8189899C34E@petsc.dev> <87edc2c0hn.fsf@jedbrown.org> Message-ID: <8621185c1419b35a1cbc7f5e2bc269eebd1b84ab.camel@kuleuven.be> Dear All, pls. find attached the proposal for https://github.com/fortran-lang/webpage/wiki/GSoC-2024-Project-ideas. I tried to keep it general such that we keep all options open. Let me know if you want to change/add/remove anything and/or if you want to be listed as mentor. Since I've mixed up the deadline, the most urgent task is the finding of suitable candidates. Once it's online, I'll post on linkedin but ideally we can motivate someone who is already known. best regards, Martin On Thu, 2024-03-21 at 23:13 -0600, Jed Brown wrote: > Barry Smith writes: > > > > We already have the generated ftn-auto-interfaces/*.h90. The > > > INTERFACE keyword could be replaced with CONTAINS (making these > > > definitions instead of just interfaces), and then the bodies > > > could use iso_c_binding to call the C functions. That would > > > reduce repetition and be the standards-compliant way to do this. > > > > ?? Sure, the interface and the stub go in the same file instead of > > two files. 
This is slightly nicer but not significantly simpler, > > and alone, it is not reason enough to write an entire new stub > > generator. > > I agree, but if one *is* writing a new stub generator for good > reasons (like better automation/completeness), there's a case for > doing it this way unless users really need an environment in which > that feature can't be used. > > > > What we do now with detecting the Fortran mangling scheme and > > > calling conventions "works" but doesn't conform to any standard > > > and there's nothing stopping Fortran implementations from > > > creating yet another variant that we have to deal with manually. > > > > ?? From practical experience, calling C/Fortran using non-standards > > has only gotten easier over the last thirty-five years (fewer > > variants on how char* is handled); it has not gotten more > > complicated, so I submit the likelihood of "nothing stopping > > Fortran implementations from creating yet another variant that we > > have to deal with manually" is (though possible) rather unlikely. > > As far as I am concerned, much of iso_c_binding stuff just solved a > > problem that never really existed (except in some people's minds) > > since calling C/Fortran has always been easy, modulo knowing a tiny > > bit of information.. > > An examples for concreteness: > > https://fortranwiki.org/fortran/show/Generating+C+Interfaces > > And discussion: > > https://fortran-lang.discourse.group/t/iso-c-binding-looking-for-practical-example-of-how-it-helps-with-mangling/3393/8 > > With this approach, one could even use method syntax like > ksp%SetOperators(J, Jpre), as in the nlopt-f project linked in the > top of this question. I don't know if we want that (it would be a > huge change for users, albeit it "easy"), but generating stubs in > Fortran using iso_c_binding opens a variety of possibilities for more > idiomatic bindings. -- KU Leuven Department of Computer Science Department of Materials Engineering Celestijnenlaan 200a 3001 Leuven, Belgium -------------- next part -------------- A non-text attachment was scrubbed... Name: PETSc.md Type: text/markdown Size: 2570 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PETSc.pdf Type: application/pdf Size: 89431 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From yc17470 at connect.um.edu.mo Sat Mar 23 07:03:24 2024 From: yc17470 at connect.um.edu.mo (Gong Yujie) Date: Sat, 23 Mar 2024 12:03:24 +0000 Subject: [petsc-users] Question about how to use DS to do the discretization Message-ID: Dear PETSc group, I'm reading the DS part for the discretization start from SNES ex17.c which is a demo for solving linear elasticity problem. I have two questions for the details. The first question is for the residual function. Is the residual calculated as this? The dot product is a little weird because of the dimension of the result. [cid:b790e49a-5adb-4085-807c-546e7c2d941f] Here \sigma is the stress tensor, \phi_i is the test function for the i-th function (Linear elasticity in 3D contains three equations). The second question is how to derive the Jacobian of the system (line 330 in ex17.c). As shown in the PetscDSSetJacobian, we need to provide function g3() which I think is a 4th-order tensor with size 3*3*3*3 in this linear elasticity case. I'm not sure how to get it. 
Are there any references on how to get this Jacobian? I've checked about the comment before this Jacobian function (line 330 in ex17.c) but don't know how to get this. Thanks in advance! Best Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 34587 bytes Desc: image.png URL: From stefano.zampini at gmail.com Sat Mar 23 07:55:39 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sat, 23 Mar 2024 15:55:39 +0300 Subject: [petsc-users] Question about how to use DS to do the discretization In-Reply-To: References: Message-ID: Take a look at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex11.c?ref_type=heads__;!!G_uCfscf7eWS!er4CI8GIe7OCWvCmRKQpZt6FOz1QYvbuZOdf2Fm7pvMGee3I9M5bhjNytv42F9C17NpBy0i6mTfgEmQfUR_QOqwC7gC6pYk$ and the discussion at the beginning (including the reference to the original paper) On Sat, Mar 23, 2024, 15:03 Gong Yujie wrote: > Dear PETSc group, I'm reading the DS part for the discretization start > from SNES ex17. c which is a demo for solving linear elasticity problem. I > have two questions for the details. The first question is for the residual > function. Is the residual > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > Dear PETSc group, > > I'm reading the DS part for the discretization start from SNES ex17.c > which is a demo for solving linear elasticity problem. I have two questions > for the details. > > The first question is for the residual function. Is the residual > calculated as this? The dot product is a little weird because of the > dimension of the result. > Here \sigma is the stress tensor, \phi_i is the test function for the i-th > function (Linear elasticity in 3D contains three equations). > > The second question is how to derive the Jacobian of the system (line 330 > in ex17.c). As shown in the PetscDSSetJacobian, we need to provide function > g3() which I think is a 4th-order tensor with size 3*3*3*3 in this linear > elasticity case. I'm not sure how to get it. Are there any references on > how to get this Jacobian? > > I've checked about the comment before this Jacobian function (line 330 in > ex17.c) but don't know how to get this. > > Thanks in advance! > > Best Regards, > Yujie > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 34587 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 34587 bytes Desc: not available URL: From knepley at gmail.com Sat Mar 23 12:43:04 2024 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 23 Mar 2024 13:43:04 -0400 Subject: [petsc-users] Question about how to use DS to do the discretization In-Reply-To: References: Message-ID: On Sat, Mar 23, 2024 at 8:03?AM Gong Yujie wrote: > Dear PETSc group, I'm reading the DS part for the discretization start > from SNES ex17. c which is a demo for solving linear elasticity problem. I > have two questions for the details. The first question is for the residual > function. Is the residual > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. 
> Dear PETSc group,
>
> I'm reading the DS part for the discretization start from SNES ex17.c which is a demo for solving linear elasticity problem. I have two questions for the details.
>
> The first question is for the residual function. Is the residual calculated as this? The dot product is a little weird because of the dimension of the result.
> Here \sigma is the stress tensor, \phi_i is the test function for the i-th function (Linear elasticity in 3D contains three equations).

The stress term in the momentum equation is

    (-div sigma) . psi = d_i sigma_{ij} psi_j

which is then integrated by parts to give

    sigma_{ij} d_i psi_j

This is linear isotropic elasticity, so

    sigma_{ij} = mu (d_i u_j + d_j u_i) + lambda delta_{ij} d_k u_k

In PETSc, we phrase the term in the weak form as f^1_{ij} d_i psi_j, so f1[i * dim + j] below is sigma_{ij}, from line 298 of ex17.c:

    for (PetscInt c = 0; c < Nc; ++c) {
      for (PetscInt d = 0; d < dim; ++d) {
        f1[c * dim + d] += mu * (u_x[c * dim + d] + u_x[d * dim + c]);
        f1[c * dim + c] += lambda * u_x[d * dim + d];
      }
    }

> The second question is how to derive the Jacobian of the system (line 330 in ex17.c). As shown in the PetscDSSetJacobian, we need to provide function g3() which I think is a 4th-order tensor with size 3*3*3*3 in this linear elasticity case. I'm not sure how to get it. Are there any references on how to get this Jacobian?

The Jacobian indices are shown here: https://petsc.org/main/manualpages/FE/PetscFEIntegrateJacobian/ where the g3 term is

    \nabla\psi^{fc}_f(q) \cdot g3_{fc,gc,df,dg}(u, \nabla u) \nabla\phi^{gc}_g(q)

To get the Jacobian, we use u = \sum_i u_i psi_i, where psi_i is a vector, and then differentiate the expression with respect to the coefficient u_i. Since the operator is already linear, this is just matching indices:

    for (PetscInt c = 0; c < Nc; ++c) {
      for (PetscInt d = 0; d < dim; ++d) {
        g3[((c * Nc + c) * dim + d) * dim + d] += mu;
        g3[((c * Nc + d) * dim + d) * dim + c] += mu;
        g3[((c * Nc + d) * dim + c) * dim + d] += lambda;
      }
    }

Take the first mu term, mu (d_c u_d) (d_c psi_d). We know that fc == gc and df == dg, so we get

    g3[((c * Nc + c) * dim + d) * dim + d] += mu;

For the second term, we transpose grad u, so fc == dg and gc == df,

    g3[((c * Nc + d) * dim + d) * dim + c] += mu;

Finally, for the lambda term, fc == df and gc == dg because we are matching terms in the sum

    g3[((c * Nc + d) * dim + c) * dim + d] += lambda;

  Thanks,

     Matt

> I've checked about the comment before this Jacobian function (line 330 in ex17.c) but don't know how to get this.
>
> Thanks in advance!
>
> Best Regards,
> Yujie

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png Type: image/png Size: 34587 bytes Desc: not available URL: From jed at jedbrown.org Sat Mar 23 22:46:39 2024 From: jed at jedbrown.org (Jed Brown) Date: Sat, 23 Mar 2024 21:46:39 -0600 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <8621185c1419b35a1cbc7f5e2bc269eebd1b84ab.camel@kuleuven.be> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> <877chvfc2n.fsf@jedbrown.org> <4D90CF28-E7F9-4F3B-B081-C8189899C34E@petsc.dev> <87edc2c0hn.fsf@jedbrown.org> <8621185c1419b35a1cbc7f5e2bc269eebd1b84ab.camel@kuleuven.be> Message-ID: <878r28s34w.fsf@jedbrown.org> An HTML attachment was scrubbed... URL: From ctchengben at mail.scut.edu.cn Mon Mar 25 02:40:06 2024 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Mon, 25 Mar 2024 15:40:06 +0800 (GMT+08:00) Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <3ec05215-510c-adb9-bedb-aa39907166fa@mcs.anl.gov> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> <8f1e656.7d8.18e617539b9.Coremail.ctchengben@mail.scut.edu.cn> <0cd51025-6f7d-587c-b155-658b5fd790b9@mcs.anl.gov> <3ec05215-510c-adb9-bedb-aa39907166fa@mcs.anl.gov> Message-ID: <104f58e4.10f7.18e748d637f.Coremail.ctchengben@mail.scut.edu.cn> An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Mar 25 09:24:44 2024 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 25 Mar 2024 10:24:44 -0400 Subject: [petsc-users] Using PetscPartitioner on WINDOWS In-Reply-To: <104f58e4.10f7.18e748d637f.Coremail.ctchengben@mail.scut.edu.cn> References: <51939dcb.1af6.18e4fb57e91.Coremail.202321009113@mail.scut.edu.cn> <75a1141e.2227.18e56094a01.Coremail.ctchengben@mail.scut.edu.cn> <309691C2-F01F-4F1B-BB51-F4E1E5C49711@petsc.dev> <0727f33c-8771-baa7-376b-a5191c9d0f16@mcs.anl.gov> <8f1e656.7d8.18e617539b9.Coremail.ctchengben@mail.scut.edu.cn> <0cd51025-6f7d-587c-b155-658b5fd790b9@mcs.anl.gov> <3ec05215-510c-adb9-bedb-aa39907166fa@mcs.anl.gov> <104f58e4.10f7.18e748d637f.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz > On Mar 25, 2024, at 3:40?AM, ?? wrote: > > This Message Is From an External Sender > This message came from outside your organization. 
> Hi, Satish > Thanks for your reply, I try your advise and I successfully configure and make petsc-3.20.5 with libmetis.lib and libparmetis.lib > Configure Options: --with-debugging=0 > --with-cc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl > --with-fc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_ifort > --with-cxx=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly > --download-parmetis --download-metis --with-strict-petscerrorcode=0 > > > but when I try to use --download-blaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz > It seem not work > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > External package BlasLapack does not support --download-blaslapack > ********************************************************************************************* > configure.log is attached. > > so If it mean we must use Intel MKL of blaslapack that for download metis and parmetis ( In this case we don't have libblaslapack.lib ) > and When I try to use PETSc on windows with Visual studio I should use mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib > > > sinserely, > Ben. > > > > > > -----????----- > > ???: "Satish Balay" > > > ????:2024-03-21 22:49:28 (???) > > ???: "Satish Balay via petsc-users" > > > ??: ?? > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > Checking for program /usr/bin/git...not found > > Checking for program /cygdrive/c/Users/Akun/AppData/Local/Programs/Git/bin/git...found > > > > Also you appear to not use 'cygwin git' I'm not sure if this alternative git would work or break - so either install/use cygwin git - or use tarballs. > > > > Satish > > > > On Thu, 21 Mar 2024, Satish Balay via petsc-users wrote: > > > > > Delete your old build files - and retry. i.e > > > > > > rm -rf /cygdrive/g/mypetsc/petsc-3.20.5/arch-mswin-c-opt > > > > > > ./configure .... > > > > > > Satish > > > > > > > > > On Thu, 21 Mar 2024, ?? wrote: > > > > > > > Hi, Satish Thanks for your reply, I try both way your said in petsc-3.?20.?5 but it encounter same question, ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS > > > > ZjQcmQRYFpfptBannerStart > > > > This Message Is From an External Sender > > > > This message came from outside your organization. 
> > > > > > > > ZjQcmQRYFpfptBannerEnd > > > > > > > > Hi, Satish > > > > Thanks for your reply, I try both way your said in petsc-3.20.5 > > > > but it encounter same question, > > > > > > > > ********************************************************************************************* > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > > --------------------------------------------------------------------------------------------- > > > > Error running make on METIS > > > > ********************************************************************************************* > > > > > > > > I send configure.log with "--download-parmetis --download-metis" to you and ask for you help. > > > > > > > > sinserely, > > > > Ben. > > > > > > > > > -----????----- > > > > > ???: "Satish Balay" > > > > > > ????:2024-03-20 21:29:56 (???) > > > > > ???: ?? > > > > > > ??: petsc-users > > > > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > > >>>> > > > > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-debugging=0 --with-cc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-fc=/cygdrive/g/mypetsc/petsc-3.20.2/lib/petsc/bin/win32fe/win_ifort --with-cxx=/cygdrive/g/mypetsc/p > > > > etsc-3.20.2/lib/petsc/bin/win32fe/win_cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/20 > > > > 21.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz --with-strict-petscerrorcode=0 > > > > > <<< > > > > > > > > > > >>>>>>>> > > > > > Warning: win32fe: File Not Found: /Ox > > > > > Error: win32fe: Input File Not Found: G:\mypetsc\PETSC-~2.2\ARCH-M~1\EXTERN~1\PETSC-~1\PETSC-~1\libmetis\/Ox > > > > > >>>>>>>>>> > > > > > > > > > > Looks like you are using an old snapshot of metis. Can you remove your local tarballs - and let [cygwin] git download the appropriate latest version? > > > > > > > > > > Or download and use: https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-metis/get/8b194fdf09661ac41b36fa16db0474d38f46f1ac.tar.gz__;!!G_uCfscf7eWS!cI7AqtwOwG0MWFBbcOA813z8p7Q2IZcdv53HvzHMxK37qmlicGatMh0ya5WWcEVfiYZ5JDmS7vfgveYi05xU_O8moU6wWwAikw$ > > > > > Similarly for parmetis https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/f5e3aab04fd5fe6e09fa02f885c1c29d349f9f8b.tar.gz__;!!G_uCfscf7eWS!cI7AqtwOwG0MWFBbcOA813z8p7Q2IZcdv53HvzHMxK37qmlicGatMh0ya5WWcEVfiYZ5JDmS7vfgveYi05xU_O8moU4L6tLXtg$ > > > > > > > > > > Satish > > > > > > > > > > On Wed, 20 Mar 2024, ?? wrote: > > > > > > > > > > > Hi I try petsc-3.?20.?2 and petsc-3.?20.?5 with configure ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.?2.?0/lib/intel64 mkl-intel-lp64-dll.?lib > > > > > > mkl-sequential-dll.?lib > > > > > > ZjQcmQRYFpfptBannerStart > > > > > > This Message Is From an External Sender > > > > > > This message came from outside your organization. 
> > > > > > > > > > > > ZjQcmQRYFpfptBannerEnd > > > > > > > > > > > > Hi > > > > > > I try petsc-3.20.2 and petsc-3.20.5 with configure > > > > > > > > > > > > ./configure --with-debugging=0 --with-cc=cl --with-fc=ifort --with-cxx=cl > > > > > > --with-blaslapack-lib=-L/cygdrive/g/Intel/oneAPI/mkl/2023.2.0/lib/intel64 mkl-intel-lp64-dll.lib mkl-sequential-dll.lib mkl-core-dll.lib > > > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > > > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly > > > > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz > > > > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz > > > > > > --with-strict-petscerrorcode=0 > > > > > > > > > > > > but it encounter same question, > > > > > > > > > > > > ********************************************************************************************* > > > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > > > > --------------------------------------------------------------------------------------------- > > > > > > Error running make on METIS > > > > > > ********************************************************************************************* > > > > > > > > > > > > and I see https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$fora successful build of lates > > > > > > t petsc-3.20 , it seem have something called "sowing" and "bison" , but I don't have. > > > > > > > > > > > > So I ask for your help, and configure.log is attached. > > > > > > > > > > > > sinserely, > > > > > > Ben. > > > > > > > > > > > > > -----????----- > > > > > > > ???: "Satish Balay" > > > > > > > > ????:2024-03-20 00:48:57 (???) > > > > > > > ???: "Barry Smith" > > > > > > > > ??: ?? >, PETSc > > > > > > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > > > > > > Check https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/jobs/6412623047__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t281f2ha4Aw$fora successful build of latest > > > > > > petsc-3.20 [i.e release branch in git] with metis and parmetis > > > > > > > > > > > > > > Note the usage: > > > > > > > > > > > > > > >>>>> > > > > > > > '--with-cc=cl', > > > > > > > '--with-cxx=cl', > > > > > > > '--with-fc=ifort', > > > > > > > <<<< > > > > > > > > > > > > > > Satish > > > > > > > > > > > > > > On Tue, 19 Mar 2024, Barry Smith wrote: > > > > > > > > > > > > > > > Are you not able to use PETSc 3.?20.?2 ? On Mar 19, 2024, at 5:?27 AM, ?? wrote: Hi,Barry I try to use PETSc version 3.?19.?5 on windows, but it encounter a problem. > > > > > > > > ********************************************************************************************* > > > > > > > > ZjQcmQRYFpfptBannerStart > > > > > > > > This Message Is From an External Sender > > > > > > > > This message came from outside your organization. > > > > > > > > > > > > > > > > ZjQcmQRYFpfptBannerEnd > > > > > > > > > > > > > > > > Are you not able to use PETSc 3.20.2 ? > > > > > > > > > > > > > > > > On Mar 19, 2024, at 5:27?AM, ?? > wrote: > > > > > > > > > > > > > > > > Hi,Barry > > > > > > > > > > > > > > > > I try to use PETSc version 3.19.5 on windows, but it encounter a problem. 
> > > > > > > > > > > > > > > > > > > > > > > > ********************************************************************************************* > > > > > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > > > > > > > --------------------------------------------------------------------------------------------- > > > > > > > > Error configuring METIS with CMake > > > > > > > > ********************************************************************************************* > > > > > > > > > > > > > > > > configure.log is attached. > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your reply! > > > > > > > > > > > > > > > > sinserely, > > > > > > > > > > > > > > > > Ben. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -----????----- > > > > > > > > ???: "Barry Smith" > > > > > > > > > ????: 2024-03-18 21:11:14 (???) > > > > > > > > ???: ?? <202321009113 at mail.scut.edu.cn > > > > > > > > > ??: petsc-users at mcs.anl.gov > > > > > > > > ??: Re: [petsc-users] Using PetscPartitioner on WINDOWS > > > > > > > > > > > > > > > > > > > > > > > > Please switch to the latest PETSc version, it supports Metis and Parmetis on Windows. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > > On Mar 17, 2024, at 11:57?PM, ?? <202321009113 at mail.scut.edu.cn > wrote: > > > > > > > > > > > > > > > > This Message Is From an External Sender > > > > > > > > This message came from outside your organization. > > > > > > > > > > > > > > > > Hello? > > > > > > > > > > > > > > > > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > > > > > > > > 1. PETSc: version 3.16.5 > > > > > > > > 2. VS: version 2022 > > > > > > > > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > > > > > > > > 4. Cygwin > > > > > > > > > > > > > > > > > > > > > > > > On windows, > > > > > > > > Then I try to calculate a simple cantilever beam that use Tetrahedral mesh. So it's unstructured grid > > > > > > > > I use DMPlexCreateFromFile() to creat dmplex. > > > > > > > > > > > > > > > > And then I want to distributing the mesh for using PETSCPARTITIONERPARMETIS type(in my opinion this PetscPartitioner type maybe the best for dmplex, > > > > > > > > > > > > > > > > see fig 1 for my work to see different PetscPartitioner type about a cantilever beam in Linux system.) 
> > > > > > > > > > > > > > > > But unfortunatly, when i try to use parmetis on windows that configure PETSc as follows > > > > > > > > > > > > > > > > > > > > > > > > ./configure --with-debugging=0 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' > > > > > > > > > > > > > > > > --download-fblaslapack=/cygdrive/g/mypetsc/petsc-pkg-fblaslapack-e8a03f57d64c.tar.gz --with-shared-libraries=0 > > > > > > > > > > > > > > > > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > > > > > > > > --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib > > > > > > > > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec > > > > > > > > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-475d8facbb32.tar.gz > > > > > > > > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-ca7a59e6283f.tar.gz > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > it shows that > > > > > > > > ******************************************************************************* > > > > > > > > External package metis does not support --download-metis with Microsoft compilers > > > > > > > > ******************************************************************************* > > > > > > > > configure.log and make.log is attached > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If I use PetscPartitioner Simple type the calculate time is much more than PETSCPARTITIONERPARMETIS type. > > > > > > > > > > > > > > > > So On windows system I want to use PetscPartitioner like parmetis , if there have any other PetscPartitioner type that can do the same work as parmetis, > > > > > > > > > > > > > > > > or I just try to download parmetis separatly on windows(like this website , https://urldefense.us/v3/__https://boogie.inm.ras.ru/terekhov/INMOST/-/wikis/0204-Compilation-ParMETIS-Windows__;!!G_uCfscf7eWS!YOO7nEnwU4BJQXD3WkP3QCvaT > > > > >> 1gfLxBxnrNdXp9SJbjETmw7uaRKaUkPRPEgWgxibROg8o_rr8SxbnaVWtJbAT-t2815v60Yvg$) > > > > > > > > > > > > > > > > and then use Visual Studio to use it's library I don't know in this way PETSc could use it successfully or not. > > > > > > > > > > > > > > > > > > > > > > > > So I wrrit this email to report my problem and ask for your help. > > > > > > > > > > > > > > > > Looking forward your reply! > > > > > > > > > > > > > > > > > > > > > > > > sinserely, > > > > > > > > Ben. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dontbugthedevs at proton.me Tue Mar 26 13:23:09 2024 From: dontbugthedevs at proton.me (Noam T.) Date: Tue, 26 Mar 2024 18:23:09 +0000 Subject: [petsc-users] FE Tabulation values Message-ID: Hello, I am trying to understand the FE Tabulation data obtained from e.g . PetscFEComputeTabulation. Using a 2D mesh with a single triangle, first order, with vertices (0,0), (0,1), (1,0) (see msh file attached), and a single quadrature point at (1/3, 1/3), one gets Nb = 6, Nc = 2, Nq = 1, and the arrays for the basis and first derivatives are of sizes [Nq x Nb x Nc] = 12 and[Nq x Nb x Nc x dim] = 24, respectively The values of these two arrays are: basis (T->T[0]) [-1/3, 0, 0, -1/3, 2/3, 0, 0, 2/3, 2/3, 0, 0, 2/3] deriv (T->T[1]) [-1/2, -1/2, 0, 0, 0, 0, -1/2, -1/2, 1/2, 0, 0, 0, 0, 0, 1/2, 0, 0, 1/2, 0, 0, 0, 0, 0, 1/2] How does one get these values? 
I can't quite find a way to relate them to evaluating the basis functions of a P1 triangle in the given quadrature point. Thanks, Noam -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: triang01.msh Type: model/mesh Size: 444 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 26 18:17:07 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Mar 2024 19:17:07 -0400 Subject: [petsc-users] FE Tabulation values In-Reply-To: References: Message-ID: On Tue, Mar 26, 2024 at 2:23?PM Noam T. via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, I am trying to understand the FE Tabulation data obtained from e. g > . PetscFEComputeTabulation. Using a 2D mesh with a single triangle, first > order, with vertices (0,0), (0,1), (1,0) (see msh file attached), and a > single quadrature point > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > Hello, > > I am trying to understand the FE Tabulation data obtained from e.g . > PetscFEComputeTabulation. Using a 2D mesh with a single triangle, first > order, with vertices (0,0), (0,1), (1,0) (see msh file attached), and a > single quadrature point at (1/3, 1/3), one gets Nb = 6, Nc = 2, Nq = 1, and > the arrays for the basis and first derivatives are of sizes [Nq x Nb x Nc] > = 12 and[Nq x Nb x Nc x dim] = 24, respectively > The tabulations from PetscFE are recorded on the reference cell. For triangles, the reference cell is (-1, -1) -- (1, -1) -- (-1, 1). The linear basis functions at these nodes are phi_0: -(x + y) / 2 phi_1: (x + 1) / 2 phi_2: (y + 1) / 2 and then you use the tensor product for Nc = 2. / phi_0 \ / 0 \ etc. \ 0 / \ phi_0 / The values of these two arrays are: > basis (T->T[0]) > [-1/3, 0, 0, -1/3, 2/3, 0, > 0, 2/3, 2/3, 0, 0, 2/3] > So these values are indeed the evaluations of those basis functions at (1/3, 1/3). The derivatives are similar. These are the evaluations you want if you are integrating in reference space, as we do for the finite element integrals, and also the only way we could use a single tabulation for the mesh. Thanks, Matt > deriv (T->T[1]) > [-1/2, -1/2, 0, 0, 0, 0, > -1/2, -1/2, 1/2, 0, 0, 0, > 0, 0, 1/2, 0, 0, 1/2, > 0, 0, 0, 0, 0, 1/2] > > How does one get these values? I can't quite find a way to relate them to > evaluating the basis functions of a P1 triangle in the given quadrature > point. > > Thanks, > Noam > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!d8GhOqpkJjEyg-Jdk6hvwyR7vg2vFqL9LClNxZuIOEKhjfIIOknalB0ZPB8qlb-PFQ9QOFJpkI3yqYMPcrLX$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzou at anl.gov Wed Mar 27 16:24:34 2024 From: lzou at anl.gov (Zou, Ling) Date: Wed, 27 Mar 2024 21:24:34 +0000 Subject: [petsc-users] Does ILU(15) still make sense or should just use LU? Message-ID: Hi, I?d like to avoid using LU, but in some cases to use ILU and still converge, I have to go to ILU(15), i.e., `-pc_factor_levels 15`. Does it still make sense, or should I give it up and switch to LU? For this particular case, ~2k DoF, and both ILU(15) and LU perform similarly in terms of wall time. 
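As an aside, the options being weighed here also have programmatic equivalents; a minimal hedged sketch (a fragment only, assuming an existing KSP named ksp and a PETSc version with PetscCall):

    #include <petscksp.h>

    /* Fragment: programmatic equivalent of -pc_type ilu -pc_factor_levels <k>
       (or -pc_type lu). Assumes "ksp" is created and set up elsewhere. */
    PC pc;
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCILU));      /* or PCLU for a full factorization */
    PetscCall(PCFactorSetLevels(pc, 2));  /* ILU(k); k = 1 or 2 is typical */
    PetscCall(KSPSetFromOptions(ksp));    /* command-line options still override */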
-Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Mar 27 16:59:46 2024 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Wed, 27 Mar 2024 21:59:46 +0000 Subject: [petsc-users] Does ILU(15) still make sense or should just use LU? In-Reply-To: References: Message-ID: Ling, ILU(level) is used for saving storage space with more computations. Normally, we use level=1 or 2. It does not make sense to use level 15. If you have sufficient space, LU would be the best. Hong ________________________________ From: petsc-users on behalf of Zou, Ling via petsc-users Sent: Wednesday, March 27, 2024 4:24 PM To: petsc-users at mcs.anl.gov Subject: [petsc-users] Does ILU(15) still make sense or should just use LU? Hi, I?d like to avoid using LU, but in some cases to use ILU and still converge, I have to go to ILU(15), i.e., `-pc_factor_levels 15`. Does it still make sense, or should I give it up and switch to LU? For this particular case, ~2k DoF, and both ILU(15) and LU perform similarly in terms of wall time. -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.diehl at kuleuven.be Thu Mar 28 05:29:20 2024 From: martin.diehl at kuleuven.be (Martin Diehl) Date: Thu, 28 Mar 2024 10:29:20 +0000 Subject: [petsc-users] Fortran interfaces: Google Summer of Code project? In-Reply-To: <878r28s34w.fsf@jedbrown.org> References: <8a1af6d55557ed5f1b98a5e7edcf0bc4f6b175a7.camel@kuleuven.be> <87frwjffks.fsf@jedbrown.org> <3C548E40-4FFF-47AB-8AFE-D32CC2191B28@petsc.dev> <877chvfc2n.fsf@jedbrown.org> <4D90CF28-E7F9-4F3B-B081-C8189899C34E@petsc.dev> <87edc2c0hn.fsf@jedbrown.org> <8621185c1419b35a1cbc7f5e2bc269eebd1b84ab.camel@kuleuven.be> <878r28s34w.fsf@jedbrown.org> Message-ID: <438dc657694cdf1ccc5e7cd4f4e318550f61726b.camel@kuleuven.be> pls find attached an image for advertising, I hope (mis)using the PETSc logo is ok. The deadline is already on April 2 On Sat, 2024-03-23 at 21:46 -0600, Jed Brown wrote: > Looks good to me. Thanks for taking the initiative. > > Martin Diehl writes: > > > Dear All, > > > > pls. find attached the proposal for > > https://github.com/fortran-lang/webpage/wiki/GSoC-2024-Project-ideas > > . > > > > I tried to keep it general such that we keep all options open. > > > > Let me know if you want to change/add/remove anything and/or if you > > want to be listed as mentor. > > > > Since I've mixed up the deadline, the most urgent task is the > > finding > > of suitable candidates. Once it's online, I'll post on linkedin but > > ideally we can motivate someone who is already known. > > > > best regards, > > Martin > > > > On Thu, 2024-03-21 at 23:13 -0600, Jed Brown wrote: > > > Barry Smith writes: > > > > > > > > We already have the generated ftn-auto-interfaces/*.h90. The > > > > > INTERFACE keyword could be replaced with CONTAINS (making > > > > > these > > > > > definitions instead of just interfaces), and then the bodies > > > > > could use iso_c_binding to call the C functions. That would > > > > > reduce repetition and be the standards-compliant way to do > > > > > this. > > > > > > > > ?? Sure, the interface and the stub go in the same file instead > > > > of > > > > two files. This is slightly nicer but not significantly > > > > simpler, > > > > and alone, it is not reason enough to write an entire new stub > > > > generator. 
> > > > > > I agree, but if one *is* writing a new stub generator for good > > > reasons (like better automation/completeness), there's a case for > > > doing it this way unless users really need an environment in > > > which > > > that feature can't be used. > > > > > > > > What we do now with detecting the Fortran mangling scheme and > > > > > calling conventions "works" but doesn't conform to any > > > > > standard > > > > > and there's nothing stopping Fortran implementations from > > > > > creating yet another variant that we have to deal with > > > > > manually. > > > > > > > > ?? From practical experience, calling C/Fortran using non- > > > > standards > > > > has only gotten easier over the last thirty-five years (fewer > > > > variants on how char* is handled); it has not gotten more > > > > complicated, so I submit the likelihood of "nothing stopping > > > > Fortran implementations from creating yet another variant that > > > > we > > > > have to deal with manually" is (though possible) rather > > > > unlikely. > > > > As far as I am concerned, much of iso_c_binding stuff just > > > > solved a > > > > problem that never really existed (except in some people's > > > > minds) > > > > since calling C/Fortran has always been easy, modulo knowing a > > > > tiny > > > > bit of information.. > > > > > > An examples for concreteness: > > > > > > https://fortranwiki.org/fortran/show/Generating+C+Interfaces > > > > > > And discussion: > > > > > > https://fortran-lang.discourse.group/t/iso-c-binding-looking-for-practical-example-of-how-it-helps-with-mangling/3393/8 > > > > > > With this approach, one could even use method syntax like > > > ksp%SetOperators(J, Jpre), as in the nlopt-f project linked in > > > the > > > top of this question. I don't know if we want that (it would be a > > > huge change for users, albeit it "easy"), but generating stubs in > > > Fortran using iso_c_binding opens a variety of possibilities for > > > more > > > idiomatic bindings. > > > > -- > > KU Leuven > > Department of Computer Science > > Department of Materials Engineering > > Celestijnenlaan 200a > > 3001 Leuven, Belgium > > # Improved generation of Fortran interfaces for > > [PETSc](https://petsc.org) > > > > PETSc, the Portable, Extensible Toolkit for Scientific Computation, > > pronounced PET-see, is for the scalable (parallel) solution of > > scientific applications modeled by partial differential equations > > (PDEs). > > It has bindings for C, Fortran, and Python (via petsc4py). PETSc > > also contains TAO, the Toolkit for Advanced Optimization, software > > library. It supports MPI, and GPUs through CUDA, HIP, Kokkos, or > > OpenCL, as well as hybrid MPI-GPU parallelism; it also supports the > > NEC-SX Tsubasa Vector Engine. > > > > Currently, only a part of the Fortran interfaces can be generated > > automatically using > > [bfort](http://wgropp.cs.illinois.edu/projects/software/sowing/bfor > > t/bfort.htm). > > Since the manual generation of the remaining interfaces is tedious > > and error prone, this project is about an improved generation of > > Fortran interfaces from PETSc's C code. 
> > > > The main tasks of this project are > > > > ?* Definition of a robust and future-proof structure for the > > Fortran interfaces > > ?* Selection and/or development of a tool that creates the > > interfaces automatically > > > > > > More specifically, the first task is about finding a suitable > > structure of the C-to-Fortran interface that reduces the need of > > 'stubs' on the C and Fortran side making use of modern Fortran > > features where appropriate. > > This task will involve evaluating different approaches found in > > other projects taking into account the object-oriented approach of > > PETSc. > > Prototypes will be implemented manually and evaluated with the help > > of the PETSc community. > > The second task is then the automated generation of the Fortran > > interfaces using the approach selected in the first task. > > To this end, it will be evaluated whether an extension of bfort, > > the use of another existing tool, or the development of a > > completely new tool (probably in Python) is the most suitable > > approach. > > > > **Links**: > > > > ?* [PETSc](https://petsc.org) > > ?* > > [bfort](http://wgropp.cs.illinois.edu/projects/software/sowing/bfor > > t/bfort.htm) > > ?* [Fortran Wiki: Generating C > > Interfaces](https://fortranwiki.org/fortran/show/Generating+C+Inter > > faces) > > ?* [Fortran Discourse: > > ISO_C_binding](https://fortran-lang.discourse.group/t/iso-c-binding > > -looking-for-practical-example-of-how-it-helps-with-mangling/3393) > > > > **Expected outcomes**: Stable and robust autogeneration of Fortran > > interfaces for PETSc that works for almost all routines > > > > **Skills preferred**: Programming experience in multiple languages, > > ideally C and/or Fortran > > > > **Difficulty**: Intermediate, 320 hours > > > > **Mentors**: Martin Diehl (@MarDiehl) -- KU Leuven Department of Computer Science Department of Materials Engineering Celestijnenlaan 200a 3001 Leuven, Belgium -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_loves_fortran.png Type: image/png Size: 146740 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From mandhapati.raju at convergecfd.com Thu Mar 28 05:37:36 2024 From: mandhapati.raju at convergecfd.com (Raju Mandhapati) Date: Thu, 28 Mar 2024 16:07:36 +0530 Subject: [petsc-users] using custom matrix vector multiplication Message-ID: Hello, I want to use my own custom matrix vector multiplication routine (which will use finite difference method to calculate it). I will supply a matrix but that matrix should be used only for preconditioner and not for matrix vector multiplication. Is there a way to do it? thanks Raju. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Mar 28 09:26:30 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 28 Mar 2024 08:26:30 -0600 Subject: [petsc-users] using custom matrix vector multiplication In-Reply-To: References: Message-ID: <87o7ayv3e1.fsf@jedbrown.org> An HTML attachment was scrubbed... URL: From lzou at anl.gov Thu Mar 28 10:43:44 2024 From: lzou at anl.gov (Zou, Ling) Date: Thu, 28 Mar 2024 15:43:44 +0000 Subject: [petsc-users] Does ILU(15) still make sense or should just use LU? In-Reply-To: References: Message-ID: Hong, thanks! That makes perfect sense. A follow up question about ILU. 
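On the custom matrix-vector product question from Raju above: one common PETSc pattern is a shell matrix whose MULT operation calls the user's finite-difference routine, while a separately assembled matrix is passed to KSPSetOperators only to build the preconditioner. The sketch below is illustrative only (made-up names; one possible approach, not a quote from this thread):

    #include <petscksp.h>

    typedef struct {
      Vec state; /* whatever the finite-difference operator action needs */
    } AppCtx;

    /* y = A(x): user-provided action of the operator, e.g. by finite differences */
    static PetscErrorCode UserMult(Mat A, Vec x, Vec y)
    {
      AppCtx *user;

      PetscFunctionBeginUser;
      PetscCall(MatShellGetContext(A, &user));
      /* ... fill y using x and user->state ... */
      PetscFunctionReturn(PETSC_SUCCESS);
    }

    /* Setup fragment: A is applied matrix-free, P is used only for the PC.
       Assumes ksp, the assembled matrix P, user, and sizes nloc/N already exist. */
    Mat A;
    PetscCall(MatCreateShell(PETSC_COMM_WORLD, nloc, nloc, N, N, &user, &A));
    PetscCall(MatShellSetOperation(A, MATOP_MULT, (void (*)(void))UserMult));
    PetscCall(KSPSetOperators(ksp, A, P)); /* Amat = shell, Pmat = assembled */

If the solve is driven by SNES, the runtime option -snes_mf_operator achieves something similar: the Jacobian action is applied matrix-free by finite differencing, while the matrix supplied to SNESSetJacobian is used only for preconditioning.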
The following is the performance of ILU(5). Note that each KPS solving reports converged but as the output shows, the preconditioned residual does while true residual does not. Is there any way this performance could be improved? Background: the preconditioning matrix is finite difference generated, and should be exact. -Ling Time Step 21, time = -491.75, dt = 1 NL Step = 0, fnorm = 6.98749E+01 0 KSP preconditioned resid norm 1.684131526824e+04 true resid norm 6.987489798042e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.970568556551e+02 true resid norm 6.459553545222e+01 ||r(i)||/||b|| 9.244455064582e-01 2 KSP preconditioned resid norm 3.349113985192e+02 true resid norm 7.250836872274e+01 ||r(i)||/||b|| 1.037688366186e+00 3 KSP preconditioned resid norm 3.290585904777e+01 true resid norm 1.186282435163e+02 ||r(i)||/||b|| 1.697723316169e+00 4 KSP preconditioned resid norm 8.530606201233e+00 true resid norm 4.088729421459e+01 ||r(i)||/||b|| 5.851499665310e-01 Linear solve converged due to CONVERGED_RTOL iterations 4 NL Step = 1, fnorm = 4.08788E+01 0 KSP preconditioned resid norm 1.851047973094e+03 true resid norm 4.087882723223e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.696809614513e+01 true resid norm 2.720016413105e+01 ||r(i)||/||b|| 6.653851387793e-01 2 KSP preconditioned resid norm 5.751891392534e+00 true resid norm 3.326338240872e+01 ||r(i)||/||b|| 8.137068663873e-01 3 KSP preconditioned resid norm 8.540729397958e-01 true resid norm 8.672410748720e+00 ||r(i)||/||b|| 2.121492062249e-01 Linear solve converged due to CONVERGED_RTOL iterations 3 NL Step = 2, fnorm = 8.67124E+00 0 KSP preconditioned resid norm 5.511333966852e+00 true resid norm 8.671237519593e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.174962622023e+00 true resid norm 8.731034658309e+00 ||r(i)||/||b|| 1.006896032842e+00 2 KSP preconditioned resid norm 1.104604471016e+00 true resid norm 1.018397505468e+01 ||r(i)||/||b|| 1.174454630227e+00 3 KSP preconditioned resid norm 4.257063674222e-01 true resid norm 4.023093124996e+00 ||r(i)||/||b|| 4.639583584126e-01 4 KSP preconditioned resid norm 1.023038868263e-01 true resid norm 2.365298462869e+00 ||r(i)||/||b|| 2.727751901068e-01 5 KSP preconditioned resid norm 4.073772638935e-02 true resid norm 2.302623112025e+00 ||r(i)||/||b|| 2.655472309255e-01 6 KSP preconditioned resid norm 1.510323179379e-02 true resid norm 2.300216593521e+00 ||r(i)||/||b|| 2.652697020839e-01 7 KSP preconditioned resid norm 1.337324816903e-02 true resid norm 2.300057733345e+00 ||r(i)||/||b|| 2.652513817259e-01 8 KSP preconditioned resid norm 1.247384902656e-02 true resid norm 2.300456226062e+00 ||r(i)||/||b|| 2.652973374174e-01 9 KSP preconditioned resid norm 1.247038855375e-02 true resid norm 2.300532560993e+00 ||r(i)||/||b|| 2.653061406512e-01 10 KSP preconditioned resid norm 1.244611343317e-02 true resid norm 2.299441241514e+00 ||r(i)||/||b|| 2.651802855496e-01 11 KSP preconditioned resid norm 1.227243209527e-02 true resid norm 2.273668115236e+00 ||r(i)||/||b|| 2.622080308720e-01 12 KSP preconditioned resid norm 1.172621459354e-02 true resid norm 2.113927895437e+00 ||r(i)||/||b|| 2.437861828442e-01 13 KSP preconditioned resid norm 2.880752338189e-03 true resid norm 1.076190247720e-01 ||r(i)||/||b|| 1.241103412620e-02 Linear solve converged due to CONVERGED_RTOL iterations 13 NL Step = 3, fnorm = 1.59729E-01 0 KSP preconditioned resid norm 1.676948440854e+03 true resid norm 1.597288981238e-01 ||r(i)||/||b|| 
1.000000000000e+00 1 KSP preconditioned resid norm 2.266131510513e+00 true resid norm 1.819663943811e+00 ||r(i)||/||b|| 1.139220244542e+01 2 KSP preconditioned resid norm 2.239911493901e+00 true resid norm 1.923976907755e+00 ||r(i)||/||b|| 1.204526501062e+01 3 KSP preconditioned resid norm 1.446859034276e-01 true resid norm 8.692945031946e-01 ||r(i)||/||b|| 5.442312026225e+00 Linear solve converged due to CONVERGED_RTOL iterations 3 NL Step = 4, fnorm = 1.59564E-01 0 KSP preconditioned resid norm 1.509663716414e+03 true resid norm 1.595641817504e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.995956587709e+00 true resid norm 1.712323298361e+00 ||r(i)||/||b|| 1.073125108390e+01 2 KSP preconditioned resid norm 1.994336275847e+00 true resid norm 1.741263472491e+00 ||r(i)||/||b|| 1.091262119975e+01 3 KSP preconditioned resid norm 1.268035008497e-01 true resid norm 8.197057317360e-01 ||r(i)||/||b|| 5.137153731769e+00 Linear solve converged due to CONVERGED_RTOL iterations 3 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 4 Solve Did NOT Converge! From: Zhang, Hong Date: Wednesday, March 27, 2024 at 4:59 PM To: petsc-users at mcs.anl.gov , Zou, Ling Subject: Re: Does ILU(15) still make sense or should just use LU? Ling, ILU(level) is used for saving storage space with more computations. Normally, we use level=1 or 2. It does not make sense to use level 15. If you have sufficient space, LU would be the best. Hong ________________________________ From: petsc-users on behalf of Zou, Ling via petsc-users Sent: Wednesday, March 27, 2024 4:24 PM To: petsc-users at mcs.anl.gov Subject: [petsc-users] Does ILU(15) still make sense or should just use LU? Hi, I?d like to avoid using LU, but in some cases to use ILU and still converge, I have to go to ILU(15), i.e., `-pc_factor_levels 15`. Does it still make sense, or should I give it up and switch to LU? For this particular case, ~2k DoF, and both ILU(15) and LU perform similarly in terms of wall time. -Ling -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Mar 28 11:13:20 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 28 Mar 2024 12:13:20 -0400 Subject: [petsc-users] Does ILU(15) still make sense or should just use LU? In-Reply-To: References: Message-ID: <3F099517-7B9B-487D-956B-82BDC579D4C8@petsc.dev> This is a bad situation, the solver is not really converging. This can happen with ILU() sometimes, it so badly scales things that the preconditioned residual decreases a lot but the true residual is not really getting smaller. Since your matrices are small best to stick to LU. You can use -ksp_norm_type unpreconditioned to force the convergence test to use the true residual for a convergence test and the solver will discover that it is not converging. Barry > On Mar 28, 2024, at 11:43?AM, Zou, Ling via petsc-users wrote: > > Hong, thanks! That makes perfect sense. > A follow up question about ILU. > > The following is the performance of ILU(5). Note that each KPS solving reports converged but as the output shows, the preconditioned residual does while true residual does not. Is there any way this performance could be improved? > Background: the preconditioning matrix is finite difference generated, and should be exact. 
> -Ling

From lzou at anl.gov  Thu Mar 28 12:14:38 2024
From: lzou at anl.gov (Zou, Ling)
Date: Thu, 28 Mar 2024 17:14:38 +0000
Subject: [petsc-users] Does ILU(15) still make sense or should just use LU?
In-Reply-To: <3F099517-7B9B-487D-956B-82BDC579D4C8@petsc.dev>
References: <3F099517-7B9B-487D-956B-82BDC579D4C8@petsc.dev>
Message-ID: 

Thank you, Barry.
Yeah, this is unfortunate given that the problem we are handling is quite heterogeneous (in both mesh and physics).
This can happen with ILU() sometimes, it so badly scales things that the preconditioned residual decreases a lot but the true residual is not really getting smaller. Since your matrices are small best to stick to LU. You can use -ksp_norm_type unpreconditioned to force the convergence test to use the true residual for a convergence test and the solver will discover that it is not converging. Barry On Mar 28, 2024, at 11:43?AM, Zou, Ling via petsc-users wrote: Hong, thanks! That makes perfect sense. A follow up question about ILU. The following is the performance of ILU(5). Note that each KPS solving reports converged but as the output shows, the preconditioned residual does while true residual does not. Is there any way this performance could be improved? Background: the preconditioning matrix is finite difference generated, and should be exact. -Ling Time Step 21, time = -491.75, dt = 1 NL Step = 0, fnorm = 6.98749E+01 0 KSP preconditioned resid norm 1.684131526824e+04 true resid norm 6.987489798042e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.970568556551e+02 true resid norm 6.459553545222e+01 ||r(i)||/||b|| 9.244455064582e-01 2 KSP preconditioned resid norm 3.349113985192e+02 true resid norm 7.250836872274e+01 ||r(i)||/||b|| 1.037688366186e+00 3 KSP preconditioned resid norm 3.290585904777e+01 true resid norm 1.186282435163e+02 ||r(i)||/||b|| 1.697723316169e+00 4 KSP preconditioned resid norm 8.530606201233e+00 true resid norm 4.088729421459e+01 ||r(i)||/||b|| 5.851499665310e-01 Linear solve converged due to CONVERGED_RTOL iterations 4 NL Step = 1, fnorm = 4.08788E+01 0 KSP preconditioned resid norm 1.851047973094e+03 true resid norm 4.087882723223e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.696809614513e+01 true resid norm 2.720016413105e+01 ||r(i)||/||b|| 6.653851387793e-01 2 KSP preconditioned resid norm 5.751891392534e+00 true resid norm 3.326338240872e+01 ||r(i)||/||b|| 8.137068663873e-01 3 KSP preconditioned resid norm 8.540729397958e-01 true resid norm 8.672410748720e+00 ||r(i)||/||b|| 2.121492062249e-01 Linear solve converged due to CONVERGED_RTOL iterations 3 NL Step = 2, fnorm = 8.67124E+00 0 KSP preconditioned resid norm 5.511333966852e+00 true resid norm 8.671237519593e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.174962622023e+00 true resid norm 8.731034658309e+00 ||r(i)||/||b|| 1.006896032842e+00 2 KSP preconditioned resid norm 1.104604471016e+00 true resid norm 1.018397505468e+01 ||r(i)||/||b|| 1.174454630227e+00 3 KSP preconditioned resid norm 4.257063674222e-01 true resid norm 4.023093124996e+00 ||r(i)||/||b|| 4.639583584126e-01 4 KSP preconditioned resid norm 1.023038868263e-01 true resid norm 2.365298462869e+00 ||r(i)||/||b|| 2.727751901068e-01 5 KSP preconditioned resid norm 4.073772638935e-02 true resid norm 2.302623112025e+00 ||r(i)||/||b|| 2.655472309255e-01 6 KSP preconditioned resid norm 1.510323179379e-02 true resid norm 2.300216593521e+00 ||r(i)||/||b|| 2.652697020839e-01 7 KSP preconditioned resid norm 1.337324816903e-02 true resid norm 2.300057733345e+00 ||r(i)||/||b|| 2.652513817259e-01 8 KSP preconditioned resid norm 1.247384902656e-02 true resid norm 2.300456226062e+00 ||r(i)||/||b|| 2.652973374174e-01 9 KSP preconditioned resid norm 1.247038855375e-02 true resid norm 2.300532560993e+00 ||r(i)||/||b|| 2.653061406512e-01 10 KSP preconditioned resid norm 1.244611343317e-02 true resid norm 2.299441241514e+00 ||r(i)||/||b|| 2.651802855496e-01 11 KSP preconditioned resid norm 
From bsmith at petsc.dev Thu Mar 28 13:08:52 2024
From: bsmith at petsc.dev (Barry Smith)
Date: Thu, 28 Mar 2024 14:08:52 -0400
Subject: [petsc-users] Does ILU(15) still make sense or should just use LU?
In-Reply-To:
References: <3F099517-7B9B-487D-956B-82BDC579D4C8@petsc.dev>
Message-ID:

1 million unknowns is possible for direct solvers using PETSc with the MUMPS direct solver when you cannot get a preconditioner to work well for your problems.

ILU is not a very robust preconditioner and I would not rely on it. Have you investigated other preconditioners in PETSc? PCGAMG, PCASM, PCFIELDSPLIT, or some combination of these preconditioners works for many problems, though certainly not all.
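In runtime-option form, the alternatives listed here look roughly as follows. This is a sketch only: the MUMPS line assumes PETSc was configured with MUMPS support (e.g. --download-mumps), and PCFIELDSPLIT additionally requires the application to describe the field/block structure of the problem to PETSc.

    # parallel direct solve through MUMPS
    -pc_type lu -pc_factor_mat_solver_type mumps
    # algebraic multigrid
    -pc_type gamg
    # additive Schwarz with incomplete-LU subdomain solves
    -pc_type asm -sub_pc_type ilu
    # physics-based splitting (needs field information from the application)
    -pc_type fieldsplit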
> On Mar 28, 2024, at 1:14 PM, Zou, Ling wrote:
>
> Thank you, Barry.
> Yeah, this is unfortunate given that the problem we are handling is quite heterogeneous (in both mesh and physics).
> I expect that our problem sizes will mostly be smaller than 1 million DOF. Should LU still be a practical solution, and can it scale well if we run the problem in parallel?
>
> PS1: -ksp_norm_type unpreconditioned did not work: the true residual did not go down, even with 300 linear iterations.
> PS2: do you think it would be beneficial to have a more detailed discussion (e.g., a presentation?) of the problem we are solving, to seek more advice?
>
> -Ling
> > -Ling > > Time Step 21, time = -491.75, dt = 1 > NL Step = 0, fnorm = 6.98749E+01 > 0 KSP preconditioned resid norm 1.684131526824e+04 true resid norm 6.987489798042e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 5.970568556551e+02 true resid norm 6.459553545222e+01 ||r(i)||/||b|| 9.244455064582e-01 > 2 KSP preconditioned resid norm 3.349113985192e+02 true resid norm 7.250836872274e+01 ||r(i)||/||b|| 1.037688366186e+00 > 3 KSP preconditioned resid norm 3.290585904777e+01 true resid norm 1.186282435163e+02 ||r(i)||/||b|| 1.697723316169e+00 > 4 KSP preconditioned resid norm 8.530606201233e+00 true resid norm 4.088729421459e+01 ||r(i)||/||b|| 5.851499665310e-01 > Linear solve converged due to CONVERGED_RTOL iterations 4 > NL Step = 1, fnorm = 4.08788E+01 > 0 KSP preconditioned resid norm 1.851047973094e+03 true resid norm 4.087882723223e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 3.696809614513e+01 true resid norm 2.720016413105e+01 ||r(i)||/||b|| 6.653851387793e-01 > 2 KSP preconditioned resid norm 5.751891392534e+00 true resid norm 3.326338240872e+01 ||r(i)||/||b|| 8.137068663873e-01 > 3 KSP preconditioned resid norm 8.540729397958e-01 true resid norm 8.672410748720e+00 ||r(i)||/||b|| 2.121492062249e-01 > Linear solve converged due to CONVERGED_RTOL iterations 3 > NL Step = 2, fnorm = 8.67124E+00 > 0 KSP preconditioned resid norm 5.511333966852e+00 true resid norm 8.671237519593e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.174962622023e+00 true resid norm 8.731034658309e+00 ||r(i)||/||b|| 1.006896032842e+00 > 2 KSP preconditioned resid norm 1.104604471016e+00 true resid norm 1.018397505468e+01 ||r(i)||/||b|| 1.174454630227e+00 > 3 KSP preconditioned resid norm 4.257063674222e-01 true resid norm 4.023093124996e+00 ||r(i)||/||b|| 4.639583584126e-01 > 4 KSP preconditioned resid norm 1.023038868263e-01 true resid norm 2.365298462869e+00 ||r(i)||/||b|| 2.727751901068e-01 > 5 KSP preconditioned resid norm 4.073772638935e-02 true resid norm 2.302623112025e+00 ||r(i)||/||b|| 2.655472309255e-01 > 6 KSP preconditioned resid norm 1.510323179379e-02 true resid norm 2.300216593521e+00 ||r(i)||/||b|| 2.652697020839e-01 > 7 KSP preconditioned resid norm 1.337324816903e-02 true resid norm 2.300057733345e+00 ||r(i)||/||b|| 2.652513817259e-01 > 8 KSP preconditioned resid norm 1.247384902656e-02 true resid norm 2.300456226062e+00 ||r(i)||/||b|| 2.652973374174e-01 > 9 KSP preconditioned resid norm 1.247038855375e-02 true resid norm 2.300532560993e+00 ||r(i)||/||b|| 2.653061406512e-01 > 10 KSP preconditioned resid norm 1.244611343317e-02 true resid norm 2.299441241514e+00 ||r(i)||/||b|| 2.651802855496e-01 > 11 KSP preconditioned resid norm 1.227243209527e-02 true resid norm 2.273668115236e+00 ||r(i)||/||b|| 2.622080308720e-01 > 12 KSP preconditioned resid norm 1.172621459354e-02 true resid norm 2.113927895437e+00 ||r(i)||/||b|| 2.437861828442e-01 > 13 KSP preconditioned resid norm 2.880752338189e-03 true resid norm 1.076190247720e-01 ||r(i)||/||b|| 1.241103412620e-02 > Linear solve converged due to CONVERGED_RTOL iterations 13 > NL Step = 3, fnorm = 1.59729E-01 > 0 KSP preconditioned resid norm 1.676948440854e+03 true resid norm 1.597288981238e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.266131510513e+00 true resid norm 1.819663943811e+00 ||r(i)||/||b|| 1.139220244542e+01 > 2 KSP preconditioned resid norm 2.239911493901e+00 true resid norm 1.923976907755e+00 ||r(i)||/||b|| 1.204526501062e+01 > 3 
From lzou at anl.gov Thu Mar 28 13:20:41 2024
From: lzou at anl.gov (Zou, Ling)
Date: Thu, 28 Mar 2024 18:20:41 +0000
Subject: [petsc-users] Does ILU(15) still make sense or should just use LU?
In-Reply-To:
References: <3F099517-7B9B-487D-956B-82BDC579D4C8@petsc.dev>
Message-ID:

Thank you, Barry. Yes, I have tried different preconditioners, but in a naïve way, i.e., looping through possible options using `-pc_type