<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Hi Junchao, neither the Slurm <span style="font-family: "Courier New", monospace;">scontrol show job_id -dd</span> output nor
<span style="font-family: "Courier New", monospace;">CUDA_VISIBLE_DEVICES</span> tells us which MPI process is associated with which GPU on a node in our system. I can see this with nvidia-smi, but if you have any other suggestion
using Slurm, I would like to hear it.</div>
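<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
One possible way to get at the rank-to-GPU association is to have each process report its own rank, PID, and visible devices, so the PIDs can be matched against the nvidia-smi listing. The following is only a minimal sketch, not something taken from this system: the function name is hypothetical, and it assumes the launcher exports the Slurm variables SLURM_PROCID and SLURM_LOCALID (or the Open MPI variables OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_LOCAL_RANK), which may not hold for every launcher or MPI.</div>
<pre>
```python
import os


def describe_gpu_binding(env=None):
    """Return a one-line summary of this process's rank, PID, and visible GPUs."""
    env = os.environ if env is None else env
    # Assumption: the launcher sets the Slurm or Open MPI rank variables.
    # Other launchers may use different names, in which case "unknown" is shown.
    rank = env.get("SLURM_PROCID", env.get("OMPI_COMM_WORLD_RANK", "unknown"))
    local_rank = env.get(
        "SLURM_LOCALID", env.get("OMPI_COMM_WORLD_LOCAL_RANK", "unknown")
    )
    visible = env.get("CUDA_VISIBLE_DEVICES", "unset")
    # The PID is included so each line can be matched to a row of nvidia-smi.
    return (
        f"rank={rank} local_rank={local_rank} "
        f"pid={os.getpid()} CUDA_VISIBLE_DEVICES={visible}"
    )


if __name__ == "__main__":
    print(describe_gpu_binding())
```
</pre>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Launched once per task with the same launcher and options as the real job, this prints one line per process. It reports only what the environment exposes, so if CUDA_VISIBLE_DEVICES is identical for every rank, the PID column is the part to compare with nvidia-smi.</div>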
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
I've been trying to compile the code and PETSc on Summit, but have been having all sorts of issues related to spectrum-mpi and the different compilers they provide (I tried gcc, nvhpc, pgi, and xl; some of them don't handle Fortran 2018, others report repeated
MPI definitions, etc.). </div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
I also wanted to ask you, do you know if it is possible to compile PETSc with the xl/16.1.1-10 suite? </div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
Thanks!<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
I configured the library with --with-cuda, and when compiling I get an error from CUDAC:</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof ContentPasted0">
<span style="font-family: "Courier New", monospace;">CUDAC arch-linux-opt-xl/obj/src/sys/classes/random/impls/curand/curand2.o</span>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/sys/classes/random/impls/curand/curand2.cu:1:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petsc/private/randomimpl.h:5:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petsc/private/petscimpl.h:7:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscsys.h:44:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscsystypes.h:532:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/complex.h:24:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/config.h:23:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/config/config.h:27:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:112:6: warning: Thrust requires at least Clang 7.0. Define THRUST_IGNORE_DEPRECATED_CPP_DIALECT to suppress this
message. [-W#pragma-messages]</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> THRUST_COMPILER_DEPRECATION(Clang 7.0);</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:101:3: note: expanded from macro 'THRUST_COMPILER_DEPRECATION'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> THRUST_COMP_DEPR_IMPL(Thrust requires at least REQ. Define THRUST_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message.)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:95:38: note: expanded from macro 'THRUST_COMP_DEPR_IMPL'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define THRUST_COMP_DEPR_IMPL(msg) THRUST_COMP_DEPR_IMPL0(GCC warning #msg)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:96:40: note: expanded from macro 'THRUST_COMP_DEPR_IMPL0'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define THRUST_COMP_DEPR_IMPL0(expr) _Pragma(#expr)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"><scratch space>:141:6: note: expanded from here</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> GCC warning "Thrust requires at least Clang 7.0. Define THRUST_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message."</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/sys/classes/random/impls/curand/curand2.cu:2:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/transform.h:721:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/transform.inl:27:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/detail/generic/transform.h:104:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/detail/generic/transform.inl:19:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/for_each.h:277:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/for_each.inl:27:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/detail/adl/for_each.h:42:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/cuda/detail/for_each.h:35:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/cuda/detail/util.h:36:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/cub/detail/device_synchronize.cuh:19:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/cub/util_arch.cuh:36:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:123:6: warning: CUB requires at least Clang 7.0. Define CUB_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message. [-W#pragma-messages]</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> CUB_COMPILER_DEPRECATION(Clang 7.0);</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:112:3: note: expanded from macro 'CUB_COMPILER_DEPRECATION'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> CUB_COMP_DEPR_IMPL(CUB requires at least REQ. Define CUB_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message.)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:106:35: note: expanded from macro 'CUB_COMP_DEPR_IMPL'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define CUB_COMP_DEPR_IMPL(msg) CUB_COMP_DEPR_IMPL0(GCC warning #msg)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:107:37: note: expanded from macro 'CUB_COMP_DEPR_IMPL0'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define CUB_COMP_DEPR_IMPL0(expr) _Pragma(#expr)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"><scratch space>:198:6: note: expanded from here</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> GCC warning "CUB requires at least Clang 7.0. Define CUB_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message."</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscsystypes.h(68): warning #1835-D: attribute "warn_unused_result" does not apply here</span></div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/sys/classes/random/impls/curand/curand2.cu:1:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petsc/private/randomimpl.h:5:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petsc/private/petscimpl.h:7:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscsys.h:44:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscsystypes.h:532:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/complex.h:24:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/config.h:23:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/config/config.h:27:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:112:6: warning: Thrust requires at least Clang 7.0. Define THRUST_IGNORE_DEPRECATED_CPP_DIALECT to suppress this
message. [-W#pragma-messages]</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> THRUST_COMPILER_DEPRECATION(Clang 7.0);</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:101:3: note: expanded from macro 'THRUST_COMPILER_DEPRECATION'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> THRUST_COMP_DEPR_IMPL(Thrust requires at least REQ. Define THRUST_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message.)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:95:38: note: expanded from macro 'THRUST_COMP_DEPR_IMPL'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define THRUST_COMP_DEPR_IMPL(msg) THRUST_COMP_DEPR_IMPL0(GCC warning #msg)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/thrust/detail/config/cpp_dialect.h:96:40: note: expanded from macro 'THRUST_COMP_DEPR_IMPL0'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define THRUST_COMP_DEPR_IMPL0(expr) _Pragma(#expr)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"><scratch space>:149:6: note: expanded from here</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> GCC warning "Thrust requires at least Clang 7.0. Define THRUST_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message."</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /autofs/nccs-svm1_home1/vanellam/Software/petsc/src/sys/classes/random/impls/curand/curand2.cu:2:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/transform.h:721:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/transform.inl:27:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/detail/generic/transform.h:104:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/detail/generic/transform.inl:19:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/for_each.h:277:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/detail/for_each.inl:27:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/detail/adl/for_each.h:42:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/cuda/detail/for_each.h:35:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/thrust/system/cuda/detail/util.h:36:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/cub/detail/device_synchronize.cuh:19:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">In file included from /sw/summit/cuda/11.7.1/include/cub/util_arch.cuh:36:</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:123:6: warning: CUB requires at least Clang 7.0. Define CUB_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message. [-W#pragma-messages]</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> CUB_COMPILER_DEPRECATION(Clang 7.0);</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:112:3: note: expanded from macro 'CUB_COMPILER_DEPRECATION'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> CUB_COMP_DEPR_IMPL(CUB requires at least REQ. Define CUB_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message.)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:106:35: note: expanded from macro 'CUB_COMP_DEPR_IMPL'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define CUB_COMP_DEPR_IMPL(msg) CUB_COMP_DEPR_IMPL0(GCC warning #msg)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/sw/summit/cuda/11.7.1/include/cub/util_cpp_dialect.cuh:107:37: note: expanded from macro 'CUB_COMP_DEPR_IMPL0'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"># define CUB_COMP_DEPR_IMPL0(expr) _Pragma(#expr)</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"><scratch space>:208:6: note: expanded from here</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> GCC warning "CUB requires at least Clang 7.0. Define CUB_IGNORE_DEPRECATED_CPP_DIALECT to suppress this message."</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscsystypes.h(68): warning #1835-D: attribute "warn_unused_result" does not apply here</span></div>
<div><br class="ContentPasted0">
</div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:55:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(a);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:78:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(a);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:107:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(len);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:144:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(t);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:150:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(s);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:198:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(flg);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:249:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(n);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:251:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(s);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:291:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(n);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:330:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(t);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:333:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(a);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:334:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(b);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:367:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(a);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:368:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(b);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:369:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(tmp);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:403:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(haystack);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:404:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(needle);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:405:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(tmp);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">/autofs/nccs-svm1_home1/vanellam/Software/petsc/include/petscstring.h:437:3: error: use of undeclared identifier '__builtin_assume'</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">; __builtin_assume(t);
</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> ^</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">fatal error: too many errors emitted, stopping now [-ferror-limit=]</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">20 errors generated.</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">Error while processing /tmp/tmpxft_0001add6_00000000-6_curand2.cudafe1.cpp.</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">gmake[3]: *** [gmakefile:209: arch-linux-opt-xl/obj/src/sys/classes/random/impls/curand/curand2.o] Error 1</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">gmake[2]: *** [/autofs/nccs-svm1_home1/vanellam/Software/petsc/lib/petsc/conf/rules.doc:28: libs] Error 2</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">**************************ERROR*************************************</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> Error during compile, check arch-linux-opt-xl/lib/petsc/conf/make.log</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;"> Send it and arch-linux-opt-xl/lib/petsc/conf/configure.log to petsc-maint@mcs.anl.gov</span></div>
<div class="ContentPasted0"><span style="font-family: "Courier New", monospace;">********************************************************************</span></div>
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);" class="elementToProof">
<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Junchao Zhang <junchao.zhang@gmail.com><br>
<b>Sent:</b> Monday, August 21, 2023 4:17 PM<br>
<b>To:</b> Vanella, Marcos (Fed) <marcos.vanella@nist.gov><br>
<b>Cc:</b> PETSc users list <petsc-users@mcs.anl.gov>; Guan, Collin X. (Fed) <collin.guan@nist.gov><br>
<b>Subject:</b> Re: [petsc-users] CUDA error trying to run a job with two mpi processes and 1 GPU</font>
<div> </div>
</div>
<div>
<div dir="ltr">That is a good question. Looking at <a href="https://slurm.schedmd.com/gres.html#GPU_Management">https://slurm.schedmd.com/gres.html#GPU_Management</a>,
I was wondering if you can share the output of your job so we can search CUDA_VISIBLE_DEVICES and see how GPUs were allocated.
<div><br>
<div>
<div dir="ltr" class="x_gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">--Junchao Zhang</div>
</div>
</div>
<br>
</div>
</div>
<br>
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Mon, Aug 21, 2023 at 2:38 PM Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov">marcos.vanella@nist.gov</a>> wrote:<br>
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div class="x_msg3869060330462788085">
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Ok thanks Junchao, so is GPU 0 actually allocating memory for the meshes of all 8 MPI processes but only working on 2 of them? </div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
The nvidia-smi output says it has allocated about 2.4GB (2488MiB).</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Best,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Marcos<br>
</div>
<div id="x_m_3869060330462788085appendonsend"></div>
<hr style="display:inline-block; width:98%">
<div id="x_m_3869060330462788085divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>><br>
<b>Sent:</b> Monday, August 21, 2023 3:29 PM<br>
<b>To:</b> Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov" target="_blank">marcos.vanella@nist.gov</a>><br>
<b>Cc:</b> PETSc users list <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>>; Guan, Collin X. (Fed) <<a href="mailto:collin.guan@nist.gov" target="_blank">collin.guan@nist.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] CUDA error trying to run a job with two mpi processes and 1 GPU</font>
<div> </div>
</div>
<div>
<div dir="ltr">Hi Marcos,
<div> If you look at the PIDs in the nvidia-smi output, you will find only 8 unique PIDs, which is expected since you allocated 8 MPI ranks per node.</div>
<div> The duplicate PIDs are usually due to threads spawned by the MPI runtime (for example, progress threads in the MPI implementation). So your job script and output both look fine.<br>
<div><br>
</div>
</div>
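<div>A minimal sketch for checking the PID count, assuming the (GPU, PID) pairs have been copied from the process table of the nvidia-smi output quoted in this thread. The function name summarize_gpu_pids is hypothetical; it only groups the pairs and does not query the GPUs itself:</div>

```shell
#!/bin/bash
# Read "gpu_index pid" pairs (one per line) on stdin, then print the number of
# unique PIDs and, for each PID, the GPUs it appears on (in first-seen order).
summarize_gpu_pids() {
    awk '
        {
            if ($2 in gpus) {
                gpus[$2] = gpus[$2] "," $1      # PID already seen: append this GPU
            } else {
                order[++n] = $2                 # remember first-seen order of PIDs
                gpus[$2] = $1
            }
        }
        END {
            print "unique_pids=" (n + 0)
            for (i = 1; i <= n; i++) print "pid=" order[i] " gpus=" gpus[order[i]]
        }
    '
}

# (GPU, PID) pairs taken from the nvidia-smi process table in this thread.
printf '%s\n' \
    "0 214626" "0 214627" "0 214628" "0 214629" \
    "0 214630" "0 214631" "0 214632" "0 214633" \
    "1 214627" "1 214631" "2 214628" "2 214632" \
    "3 214629" "3 214633" | summarize_gpu_pids
```

<div>On a live node, something like <span style="font-family:'Courier New',monospace">nvidia-smi --query-compute-apps=gpu_uuid,pid --format=csv,noheader | tr -d ','</span> should give suitable input, with GPU UUIDs in place of the indices; that pipeline is an assumption and was not part of the original exchange.</div>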
<div> Thanks.</div>
</div>
<br>
<div>
<div dir="ltr">On Mon, Aug 21, 2023 at 2:00 PM Vanella, Marcos (Fed) <<a href="mailto:marcos.vanella@nist.gov" target="_blank">marcos.vanella@nist.gov</a>> wrote:<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div>
<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Hi Junchao, something I'm noticing when running with CUDA-enabled linear solvers (CG+HYPRE, CG+GAMG) is that, for multi-CPU, multi-GPU calculations, GPU 0 in the node seems to be taking the sub-matrices of all the MPI processes in
the node. This is the output of the nvidia-smi command on a node with 8 MPI processes (each advancing the same number of unknowns in the calculation) and 4 V100 GPUs:</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<span style="font-family:"Courier New",monospace">Mon Aug 21 14:36:07 2023 </span>
<div><span style="font-family:"Courier New",monospace">+---------------------------------------------------------------------------------------+</span></div>
<div><span style="font-family:"Courier New",monospace">| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |</span></div>
<div><span style="font-family:"Courier New",monospace">|-----------------------------------------+----------------------+----------------------+</span></div>
<div><span style="font-family:"Courier New",monospace">| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |</span></div>
<div><span style="font-family:"Courier New",monospace">| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |</span></div>
<div><span style="font-family:"Courier New",monospace">| | | MIG M. |</span></div>
<div><span style="font-family:"Courier New",monospace">|=========================================+======================+======================|</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 Tesla V100-SXM2-16GB On | 00000004:04:00.0 Off | 0 |</span></div>
<div><span style="font-family:"Courier New",monospace">| N/A 34C P0 63W / 300W | 2488MiB / 16384MiB | 0% Default |</span></div>
<div><span style="font-family:"Courier New",monospace">| | | N/A |</span></div>
<div><span style="font-family:"Courier New",monospace">+-----------------------------------------+----------------------+----------------------+</span></div>
<div><span style="font-family:"Courier New",monospace">| 1 Tesla V100-SXM2-16GB On | 00000004:05:00.0 Off | 0 |</span></div>
<div><span style="font-family:"Courier New",monospace">| N/A 38C P0 56W / 300W | 638MiB / 16384MiB | 0% Default |</span></div>
<div><span style="font-family:"Courier New",monospace">| | | N/A |</span></div>
<div><span style="font-family:"Courier New",monospace">+-----------------------------------------+----------------------+----------------------+</span></div>
<div><span style="font-family:"Courier New",monospace">| 2 Tesla V100-SXM2-16GB On | 00000035:03:00.0 Off | 0 |</span></div>
<div><span style="font-family:"Courier New",monospace">| N/A 35C P0 52W / 300W | 638MiB / 16384MiB | 0% Default |</span></div>
<div><span style="font-family:"Courier New",monospace">| | | N/A |</span></div>
<div><span style="font-family:"Courier New",monospace">+-----------------------------------------+----------------------+----------------------+</span></div>
<div><span style="font-family:"Courier New",monospace">| 3 Tesla V100-SXM2-16GB On | 00000035:04:00.0 Off | 0 |</span></div>
<div><span style="font-family:"Courier New",monospace">| N/A 38C P0 53W / 300W | 638MiB / 16384MiB | 0% Default |</span></div>
<div><span style="font-family:"Courier New",monospace">| | | N/A |</span></div>
<div><span style="font-family:"Courier New",monospace">+-----------------------------------------+----------------------+----------------------+</span></div>
<div><span style="font-family:"Courier New",monospace"> </span></div>
<div><span style="font-family:"Courier New",monospace">+---------------------------------------------------------------------------------------+</span></div>
<div><span style="font-family:"Courier New",monospace">| Processes: |</span></div>
<div><span style="font-family:"Courier New",monospace">| GPU GI CI PID Type Process name GPU Memory |</span></div>
<div><span style="font-family:"Courier New",monospace">| ID ID Usage |</span></div>
<div><span style="font-family:"Courier New",monospace">|=======================================================================================|</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214626 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214627 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 308MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214628 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 308MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214629 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 308MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214630 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214631 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 308MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214632 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 308MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 0 N/A N/A 214633 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 308MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 1 N/A N/A 214627 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 1 N/A N/A 214631 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 2 N/A N/A 214628 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 2 N/A N/A 214632 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 3 N/A N/A 214629 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<div><span style="font-family:"Courier New",monospace">| 3 N/A N/A 214633 C ...d/ompi_gnu_linux/fds_ompi_gnu_linux 318MiB |</span></div>
<span style="font-family:"Courier New",monospace">+---------------------------------------------------------------------------------------+</span><br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
You can see that GPU 0 is connected to all 8 MPI processes, each taking about 300MB on it, whereas GPUs 1, 2 and 3 are each working with 2 MPI processes. I'm wondering if this is expected or whether there are changes I need to make to my submission script/runtime parameters.</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
This is the script in this case (2 nodes, 8 MPI processes/node, 4 GPUs/node):</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<div><span style="font-family:"Courier New",monospace">#!/bin/bash</span></div>
<div><span style="font-family:"Courier New",monospace"># ../../Utilities/Scripts/qfds.sh -p 2 -T db -d test.fds</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH -J test </span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH -e /home/mnv/Firemodels_fork/fds/Issues/PETSc/test.err</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH -o /home/mnv/Firemodels_fork/fds/Issues/PETSc/test.log</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --partition=gpu</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --ntasks=16</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --ntasks-per-node=8</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --cpus-per-task=1</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --nodes=2</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --time=01:00:00</span></div>
<div><span style="font-family:"Courier New",monospace">#SBATCH --gres=gpu:4</span></div>
<br>
<div><span style="font-family:"Courier New",monospace">export OMP_NUM_THREADS=1</span></div>
<div><span style="font-family:"Courier New",monospace"># modules</span></div>
<div><span style="font-family:"Courier New",monospace">module load cuda/11.7</span></div>
<div><span style="font-family:"Courier New",monospace">module load gcc/11.2.1/toolset</span></div>
<div><span style="font-family:"Courier New",monospace">module load openmpi/4.1.4/gcc-11.2.1-cuda-11.7</span></div>
<div><br>
</div>
<div><span style="font-family:"Courier New",monospace">cd /home/mnv/Firemodels_fork/fds/Issues/PETSc</span></div>
<div><br>
</div>
<div></div>
<span style="font-family:"Courier New",monospace">srun -N 2 -n 16 /home/mnv/Firemodels_fork/fds/Build/ompi_gnu_linux/fds_ompi_gnu_linux test.fds -pc_type gamg -mat_type aijcusparse -vec_type cuda</span>
<div></div>
<span style="font-family:"Courier New",monospace"> </span><br>
</div>
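<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
One possible runtime change, sketched here as an assumption rather than something confirmed in this thread, is a small wrapper that exposes a single GPU to each rank before the program starts. The wrapper name gpu_bind.sh, the function select_gpu and the variable GPUS_PER_NODE are hypothetical; the sketch assumes srun sets SLURM_LOCALID and that all 4 GPUs of the node are allocated to the job:</div>

```shell
#!/bin/bash
# gpu_bind.sh (hypothetical name): expose exactly one GPU to each MPI rank.
# Assumes SLURM_LOCALID is the rank's index within its node (set by srun) and
# that GPUS_PER_NODE matches the "--gres=gpu:4" request in the job script.
select_gpu() {
    local local_rank="$1"
    local gpus_per_node="$2"
    # Round-robin: with 8 ranks and 4 GPUs, local ranks 0 and 4 share GPU 0,
    # local ranks 1 and 5 share GPU 1, and so on.
    echo $(( local_rank % gpus_per_node ))
}

GPUS_PER_NODE="${GPUS_PER_NODE:-4}"
export CUDA_VISIBLE_DEVICES="$(select_gpu "${SLURM_LOCALID:-0}" "$GPUS_PER_NODE")"

# Run the real program, if one was given, with the restricted GPU view.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
It would be placed between srun and the executable, for example <span style="font-family:'Courier New',monospace">srun -N 2 -n 16 ./gpu_bind.sh /home/mnv/Firemodels_fork/fds/Build/ompi_gnu_linux/fds_ompi_gnu_linux test.fds -pc_type gamg -mat_type aijcusparse -vec_type cuda</span>. Each process then sees only one device, so whether the extra per-process entries on GPU 0 disappear can be checked again with nvidia-smi.</div>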
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thank you for the advice,</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Marcos<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div id="x_m_3869060330462788085x_m_-2525567993800845248appendonsend"></div>
<br>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</body>
</html>