<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div> Thanks Jeff,<div class=""><br class=""></div><div class=""> The information is there somewhere; the issue is getting it in a simple way, automatically, at PETSc configure time, in a manner that is portable and will never crash. <a href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html" class="">https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html</a> seems to require compiling a program and running it to get the information; this means invoking nvcc (which sub-compiler to use for nvcc, with which flags, etc.? Not so easy on systems like Summit where there are multiple choices for the sub-compiler). So how complicated and fragile do we want to make PETSc configure (for each particular piece of hardware) so that it always gets the best information about the current hardware? </div><div class=""><br class=""></div><div class=""> It looks like Kokkos really only needs the NVIDIA numerical generation information, not the code name, but their API requires both the codename (irrelevant) and the numerical information (relevant) in what is passed to Kokkos. The problem has always been generating the irrelevant part so Kokkos does not complain. We can, with a little pain, possibly automate the CUDA device information (the numerical part) completely, but the mapping to code name has been problematic because it is hard to find the mapping from numerical information to codename in a single place. 
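<div class=""><br class=""></div><div class="">For illustration only, the numeric-to-codename mapping described above could be sketched in Python roughly as follows. This is not PETSc's actual configure code: the helper names codename and kokkos_arch are hypothetical, the table of major numbers follows the CUDA programming guide quoted further down in this thread (plus Turing being 7.5), and the Kokkos_ARCH_VOLTA70 style of option name is an assumption about how Kokkos spells its architecture options.</div>
<pre class="">
```python
# Hypothetical sketch, NOT PETSc's configure code.
# Major compute capability number to architecture code name.
CODENAME_BY_MAJOR = {
    1: "TESLA",
    2: "FERMI",
    3: "KEPLER",
    5: "MAXWELL",
    6: "PASCAL",
    7: "VOLTA",
    8: "AMPERE",
}


def codename(major, minor):
    """Return the code name for a compute capability major.minor."""
    # Turing shares major number 7 with Volta, so it is special-cased.
    if (major, minor) == (7, 5):
        return "TURING"
    # Raises KeyError for generations this table does not know about.
    return CODENAME_BY_MAJOR[major]


def kokkos_arch(major, minor):
    """Build an option name such as Kokkos_ARCH_VOLTA70 (assumed spelling)."""
    return "Kokkos_ARCH_{}{}{}".format(codename(major, minor), major, minor)


if __name__ == "__main__":
    print(kokkos_arch(7, 0))
```
</pre>
<div class="">The numerical part (major, minor) would still have to come from somewhere else, for example a small program calling cudaDeviceGetAttribute; the sketch covers only the lookup.</div><div class=""><br class=""></div>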
But I think, thanks to Max's input, I now understand the mapping and have put it in PETSc's configure.</div><div class=""><br class=""></div><div class=""> Generically, independent of Kokkos, ideally I would run a single precompiled NVIDIA program that gave me all the information about the hardware I was running on and that provided, in a simple format, exactly the information I needed to configure PETSc, Kokkos, etc. for THAT system. The idea of supporting a multitude of hardware is important for package management systems, but is not important for 99% of PETSc users, who are configuring for exactly the hardware they have on the system they are configuring on; all they care about is "give me the best reasonable performance on the machine I am using today". This means the system software should be able to provide in a trivial way what the current hardware is. The problem is not unique to GPUs, of course; it is not always easy to get this information in a portable way for generic CPUs either.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""> Barry</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div class=""><br class=""></div><div class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Apr 5, 2021, at 7:32 PM, Jeff Hammond <<a href="mailto:jeff.science@gmail.com" class="">jeff.science@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">NVCC has supported multi-versioned "fat" binaries since I worked for Argonne. Libraries should figure out what the oldest hardware they care about is and then compile for everything from that point forward. Kepler (3.5) is the oldest version any reasonable person should be thinking about at this point. The oldest thing I know of in the DOE HPC fleet is Pascal (6.x). 
Volta and Turing are 7.x and Ampere is 8.x.<div class=""><br class=""></div><div class="">The biggest architectural changes came with unified memory (<a href="https://developer.nvidia.com/blog/unified-memory-in-cuda-6/" class="">https://developer.nvidia.com/blog/unified-memory-in-cuda-6/</a>) and cooperative groups (<a href="https://developer.nvidia.com/blog/cooperative-groups/" class="">https://developer.nvidia.com/blog/cooperative-groups/</a>, in CUDA 9), but Kokkos doesn't use the latter. Both features can be used on quite old GPU architectures, although the performance is better on newer ones.<div class=""><br class=""></div><div class="">I haven't dug into what Kokkos and PETSc are doing, but the direct use of this stuff in CUDA is well documented, certainly as well as the CPU switches for x86 binaries in the Intel compiler are.<br class=""><div class=""><br class=""></div><div class=""><a href="https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities" class="">https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities</a><br class=""></div><div class=""><br class=""></div><div class=""><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class="">Devices with the same major revision number are of the same core architecture. 
The major revision number is 8 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">NVIDIA Ampere GPU</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture, 7 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">Volta</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture, 6 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">Pascal</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture, 5 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">Maxwell</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture, 3 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">Kepler</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture, 2 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">Fermi</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture, and 1 for devices based on the </span><dfn class="gmail-term" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;">Tesla</dfn><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;" class=""> architecture.</span><br class=""></div><div class=""><br class=""></div><div class=""><a 
href="https://docs.nvidia.com/cuda/pascal-compatibility-guide/index.html#building-pascal-compatible-apps-using-cuda-8-0" class="">https://docs.nvidia.com/cuda/pascal-compatibility-guide/index.html#building-pascal-compatible-apps-using-cuda-8-0</a><br class=""></div><div class=""><a href="https://docs.nvidia.com/cuda/volta-compatibility-guide/index.html#building-volta-compatible-apps-using-cuda-9-0" class="">https://docs.nvidia.com/cuda/volta-compatibility-guide/index.html#building-volta-compatible-apps-using-cuda-9-0</a><br class=""></div><div class=""><a href="https://docs.nvidia.com/cuda/turing-compatibility-guide/index.html#building-turing-compatible-apps-using-cuda-10-0" class="">https://docs.nvidia.com/cuda/turing-compatibility-guide/index.html#building-turing-compatible-apps-using-cuda-10-0</a><br class=""></div><div class=""><a href="https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html#building-ampere-compatible-apps-using-cuda-11-0" class="">https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html#building-ampere-compatible-apps-using-cuda-11-0</a><br class=""></div><div class=""><br class=""></div><div class="">Programmatic querying can be done with the following (<a href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html" class="">https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html</a>):</div><div class=""><br class=""></div><div class=""><span style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px; background-color: rgb(239, 239, 240);" class="">cudaDeviceGetAttribute</span><br class=""></div><div class=""><ul class="gmail-ul" style="font-family: "Trebuchet MS", "DIN Pro", sans-serif; font-size: 14px;"><li class="gmail-li"><div style="margin: 0px;" class=""><a class="gmail-xref" href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg49e2f8c2c0bd6fe264f2fc970912e5cd220ff111a6616ab512e229d8f2f8bf87" shape="rect" 
style="color:rgb(118,185,0)">cudaDevAttrComputeCapabilityMajor</a>: Major compute capability version number;</div></li><li class="gmail-li"><div style="margin: 0px;" class=""><a class="gmail-xref" href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg49e2f8c2c0bd6fe264f2fc970912e5cd2c981c76c9de58d39502e483a7b484c7" shape="rect" style="color:rgb(118,185,0)">cudaDevAttrComputeCapabilityMinor</a>: Minor compute capability version number;</div></li></ul></div><div class="">The compiler help tells me this, which can be cross-referenced with CUDA documentation above.</div><div class=""><br class=""></div><div class=""><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures">$ /usr/local/cuda-10.0/bin/nvcc -h</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0); min-height: 17px;" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"></span><br class=""></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures">Usage<span class="gmail-Apple-converted-space"> </span>: nvcc [options] <inputfile></span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); 
background-color: rgb(0, 0, 0);" class=""><br class=""></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class="">...</div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><br class=""></div></div><div class=""><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures">Options for steering GPU code generation.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures">=========================================</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0); min-height: 17px;" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"></span><br class=""></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" 
style="font-variant-ligatures:no-common-ligatures">--gpu-architecture <arch><span class="gmail-Apple-converted-space"> </span>(-arch)<span class="gmail-Apple-converted-space"> </span></span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>Specify the name of the class of NVIDIA 'virtual' GPU architecture for which</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>the CUDA input files must be compiled.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>With the exception as described for the shorthand below, the architecture</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>specified with this option must be a 'virtual' architecture (such as compute_50).</span></div><div style="margin: 0px; 
font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>Normally, this option alone does not trigger assembly of the generated PTX</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>for a 'real' architecture (that is the role of nvcc option '--gpu-code',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>see below); rather, its purpose is to control preprocessing and compilation</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>of the input to PTX.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" 
style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>For convenience, in case of simple nvcc compilations, the following shorthand</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>is supported.<span class="gmail-Apple-converted-space"> </span>If no value for option '--gpu-code' is specified, then the</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>value of this option defaults to the value of '--gpu-architecture'.<span class="gmail-Apple-converted-space"> </span>In this</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>situation, as only exception to the description above, the value specified</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> 
</span>for '--gpu-architecture' may be a 'real' architecture (such as a sm_50),</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>in which case nvcc uses the specified 'real' architecture and its closest</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'virtual' architecture as effective architecture values.<span class="gmail-Apple-converted-space"> </span>For example, 'nvcc</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>--gpu-architecture=sm_50' is equivalent to 'nvcc --gpu-architecture=compute_50</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>--gpu-code=sm_50,compute_50'.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: 
normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>Allowed values for this option:<span class="gmail-Apple-converted-space"> </span>'compute_30','compute_32','compute_35',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'compute_37','compute_50','compute_52','compute_53','compute_60','compute_61',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'compute_62','compute_70','compute_72','compute_75','sm_30','sm_32','sm_35',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'sm_37','sm_50','sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span 
class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'sm_75'.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0); min-height: 17px;" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"></span><br class=""></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures">--gpu-code <code>,...<span class="gmail-Apple-converted-space"> </span>(-code)<span class="gmail-Apple-converted-space"> </span></span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>Specify the name of the NVIDIA GPU to assemble and optimize PTX for.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>nvcc embeds a compiled code image in the resulting executable for each specified</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; 
font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span><code> architecture, which is a true binary load image for each 'real' architecture</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>(such as sm_50), and PTX code for the 'virtual' architecture (such as compute_50).</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>During runtime, such embedded PTX code is dynamically compiled by the CUDA</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>runtime system if no binary load image is found for the 'current' GPU.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span 
class="gmail-Apple-converted-space"> </span>Architectures specified for options '--gpu-architecture' and '--gpu-code'</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>may be 'virtual' as well as 'real', but the <code> architectures must be</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>compatible with the <arch> architecture.<span class="gmail-Apple-converted-space"> </span>When the '--gpu-code' option is</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>used, the value for the '--gpu-architecture' option must be a 'virtual' PTX</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>architecture.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; 
font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>For instance, '--gpu-architecture=compute_35' is not compatible with '--gpu-code=sm_30',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>because the earlier compilation stages will assume the availability of 'compute_35'</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>features that are not present on 'sm_30'.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>Allowed values for this option:<span class="gmail-Apple-converted-space"> </span>'compute_30','compute_32','compute_35',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span 
class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'compute_37','compute_50','compute_52','compute_53','compute_60','compute_61',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'compute_62','compute_70','compute_72','compute_75','sm_30','sm_32','sm_35',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'sm_37','sm_50','sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72',</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'sm_75'.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0); min-height: 17px;" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"></span><br class=""></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; 
font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures">--generate-code <specification>,...<span class="gmail-Apple-converted-space"> </span>(-gencode) <span class="gmail-Apple-converted-space"> </span></span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>This option provides a generalization of the '--gpu-architecture=<arch> --gpu-code=<code>,</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>...' 
option combination for specifying nvcc behavior with respect to code</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>generation.<span class="gmail-Apple-converted-space"> </span>Where use of the previous options generates code for different</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'real' architectures with the PTX for the same 'virtual' architecture, option</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'--generate-code' allows multiple PTX generations for different 'virtual'</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>architectures.<span class="gmail-Apple-converted-space"> </span>In fact, '--gpu-architecture=<arch> --gpu-code=<code>,</span></div><div style="margin: 0px; 
font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>...' is equivalent to '--generate-code arch=<arch>,code=<code>,...'.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>'--generate-code' options may be repeated for different virtual architectures.</span></div><div style="margin: 0px; font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 13px; line-height: normal; font-family: Monaco; color: rgb(242, 242, 242); background-color: rgb(0, 0, 0);" class=""><span class="gmail-s1" style="font-variant-ligatures:no-common-ligatures"><span class="gmail-Apple-converted-space"> </span>Allowed keywords for this option:<span class="gmail-Apple-converted-space"> </span>'arch','code'.</span></div></div></div></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 5, 2021 at 1:19 PM Satish Balay via petsc-dev <<a href="mailto:petsc-dev@mcs.anl.gov" class="">petsc-dev@mcs.anl.gov</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This is nvidia mess-up. Why isn't there a command that give me these values [if they insist on this interface for nvcc]<br class="">
<br class="">
I see Barry wants configure to do something here - but whatever we do - we would just be shifting the problem around.<br class="">
[even if we detect stuff - the build box might not have the GPU used for runs.]<br class="">
<br class="">
We have --with-cuda-arch - which I tried to remove from configure - but it's come back in a different form (--with-cuda-gencodearch)<br class="">
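<br class="">
For illustration, here is a minimal sketch of how a list of numeric architectures such as 60 or 70 could be expanded into the nvcc -gencode options described in the help text above. The helper name gencode_flags is hypothetical and this is not PETSc's actual configure code; it only assumes the arch=compute_XX,code=sm_XX form that nvcc documents.<br class="">
<pre class="">
```python
def gencode_flags(capabilities, ptx_for_newest=True):
    """Build nvcc -gencode options from compute capabilities like ["60", "70"].

    One real (sm_XX) target is emitted per capability. Optionally, PTX for the
    newest capability is embedded as well so that later GPUs can JIT-compile it.
    """
    # Remove duplicates and order numerically so the newest capability is last.
    caps = sorted(set(capabilities), key=int)
    flags = ["-gencode arch=compute_{0},code=sm_{0}".format(cc) for cc in caps]
    if ptx_for_newest and caps:
        # code=compute_XX asks nvcc to keep the PTX rather than machine code.
        flags.append("-gencode arch=compute_{0},code=compute_{0}".format(caps[-1]))
    return flags


if __name__ == "__main__":
    print(" ".join(gencode_flags(["70", "60"])))
```
</pre>
Passing several capabilities produces the kind of multi-versioned "fat" binary Jeff describes, at the cost of longer compile times and larger objects.<br class="">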
<br class="">
And I see other packages:<br class="">
<br class="">
--with-kokkos-cuda-arch<br class="">
<br class="">
Wrt spack - I'm having to do:<br class="">
<br class="">
spack install xsdk+cuda ^magma cuda_arch=60<br class="">
<br class="">
[magma uses CudaPackage() infrastructure in spack]<br class="">
<br class="">
Satish<br class="">
<br class="">
On Mon, 5 Apr 2021, Mills, Richard Tran via petsc-dev wrote:<br class="">
<br class="">
> You raise a good point, Barry. I've been completely mystified by what some of these names even mean. What does "PASCAL60" vs. "PASCAL61" even mean? Do you know where this is even documented? I can't really find anything about it in the Kokkos documentation. The only thing I can really find is an issue or two about "hey, shouldn't our CMake stuff figure this out automatically" and then some posts about why it can't really do that. Not encouraging.<br class="">
> <br class="">
> --Richard<br class="">
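<br class="">
On the PASCAL60 vs. PASCAL61 question: as discussed elsewhere in this thread, the name is the NVIDIA code name followed by the numeric compute capability, so PASCAL60 is Pascal at compute capability 6.0 and PASCAL61 is Pascal at 6.1. A minimal sketch of that mapping follows. The table lists the names as I understand the Kokkos convention for these generations and should be checked against the Kokkos version in use; the function name kokkos_arch_name is hypothetical.<br class="">
<pre class="">
```python
# Compute capability digits mapped to Kokkos architecture names.
# Assumption: these follow the Kokkos naming convention; verify against
# the Kokkos release you build with, since newer GPUs add further entries.
KOKKOS_ARCH = {
    "30": "KEPLER30", "32": "KEPLER32", "35": "KEPLER35", "37": "KEPLER37",
    "50": "MAXWELL50", "52": "MAXWELL52", "53": "MAXWELL53",
    "60": "PASCAL60", "61": "PASCAL61",
    "70": "VOLTA70", "72": "VOLTA72", "75": "TURING75",
    "80": "AMPERE80", "86": "AMPERE86",
}


def kokkos_arch_name(compute_cap):
    """Return the Kokkos name for "6.0" or "60", or None if it is not in the table."""
    digits = compute_cap.replace(".", "").strip()
    return KOKKOS_ARCH.get(digits)


if __name__ == "__main__":
    for cap in ("6.0", "6.1", "7.5"):
        print(cap, kokkos_arch_name(cap))
```
</pre>
An explicit table is used rather than deriving the code name from the major version, because 7.0 and 7.5 share a major version but carry different code names (Volta and Turing).<br class="">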
> <br class="">
> On 4/3/21 8:42 PM, Barry Smith wrote:<br class="">
> <br class="">
> <br class="">
> It would be very nice to NOT require PETSc users to provide this flag. How the heck will they know what it should be when we cannot automate it ourselves?<br class="">
> <br class="">
> Any ideas on how this can be determined from the current system? NVIDIA does not help, since these "advertising" names don't seem to map trivially to information you can get from a particular GPU when you are logged into it. For example, nvidia-smi doesn't use these names directly. Is there some mapping from nvidia-smi to these names we could use? If we are serious about having a non-trivial number of users utilizing GPUs, which we need to be for the future, we cannot have these absurd demands in our installation process.<br class="">
> <br class="">
> Barry<br class="">
> <br class="">
> Does spack have some magic for this we could use?<br class="">
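<br class="">
One possible route for the nvidia-smi question above, sketched under assumptions: recent NVIDIA drivers accept a compute_cap field in nvidia-smi --query-gpu, which prints the numeric capability (for example 7.0) without compiling anything. Older drivers do not support that field, and the build machine may have no GPU at all, so the sketch returns an empty list in every failure case instead of raising. The function names are hypothetical.<br class="">
<pre class="">
```python
import shutil
import subprocess


def parse_compute_caps(text):
    """Turn nvidia-smi output such as "7.0" (one line per GPU) into ["70"]."""
    caps = []
    for line in text.splitlines():
        digits = line.strip().replace(".", "")
        # Skip blank lines and non-numeric answers such as "[N/A]".
        if digits.isdigit() and digits not in caps:
            caps.append(digits)
    return caps


def detect_compute_caps():
    """Query the local GPUs; return [] if nvidia-smi is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        # Drivers that do not know the compute_cap field exit with an error.
        return []
    return parse_compute_caps(result.stdout)


if __name__ == "__main__":
    print(detect_compute_caps())
```
</pre>
An empty result would mean falling back to a user-supplied value, which also covers Satish's case where the build box does not have the GPU used for runs.<br class="">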
> <br class="">
> <br class="">
> <br class="">
> <br class="">
<br class="">
</blockquote></div><br clear="all" class=""><div class=""><br class=""></div>-- <br class=""><div dir="ltr" class="gmail_signature">Jeff Hammond<br class=""><a href="mailto:jeff.science@gmail.com" target="_blank" class="">jeff.science@gmail.com</a><br class=""><a href="http://jeffhammond.github.io/" target="_blank" class="">http://jeffhammond.github.io/</a></div>
</div></blockquote></div><br class=""></div></div></body></html>