<div dir="ltr"><div dir="ltr"><br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 5, 2021 at 7:33 PM Jeff Hammond <<a href="mailto:jeff.science@gmail.com">jeff.science@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">NVCC has supported multi-versioned "fat" binaries since I worked for Argonne.  Libraries should figure out what the oldest hardware they care about is and then compile for everything from that point forward.  Kepler (3.5) is the oldest version any reasonable person should be thinking about at this point.  The oldest thing I know of in the DOE HPC fleet is Pascal (6.x).  Volta and Turing are 7.x and Ampere is 8.x.<div><br></div><div>The biggest architectural changes came with unified memory (<a href="https://developer.nvidia.com/blog/unified-memory-in-cuda-6/" target="_blank">https://developer.nvidia.com/blog/unified-memory-in-cuda-6/</a>) and cooperative groups (<a href="https://developer.nvidia.com/blog/cooperative-groups/" target="_blank">https://developer.nvidia.com/blog/cooperative-groups/</a> in CUDA 9) but Kokkos doesn't use the latter.  Both features can be used on quite old GPU architectures, although the performance is better on newer ones.<div><br></div><div>I haven't dug into what Kokkos and PETSc are doing but the direct use of this stuff in CUDA is well-documented, certainly as well as the CPU switches for x86 binaries in the Intel compiler are.<br><div><br></div><div><a href="https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities" target="_blank">https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities</a><br></div><div><br></div><div><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Devices with the same major revision number are of the same core architecture. 
The major revision number is 8 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">NVIDIA Ampere GPU</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture, 7 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Volta</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture, 6 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Pascal</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture, 5 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Maxwell</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture, 3 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Kepler</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture, 2 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Fermi</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture, and 1 for devices based on the </span><dfn style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px">Tesla</dfn><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"> architecture.</span></div></div></div></div></blockquote>Kokkos has config options Kokkos_ARCH_TURING75, Kokkos_ARCH_VOLTA70, Kokkos_ARCH_VOLTA72.    
Any idea how one can map compute capability versions to arch names?<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div><br></div><div><br></div><div><a href="https://docs.nvidia.com/cuda/pascal-compatibility-guide/index.html#building-pascal-compatible-apps-using-cuda-8-0" target="_blank">https://docs.nvidia.com/cuda/pascal-compatibility-guide/index.html#building-pascal-compatible-apps-using-cuda-8-0</a><br></div><div><a href="https://docs.nvidia.com/cuda/volta-compatibility-guide/index.html#building-volta-compatible-apps-using-cuda-9-0" target="_blank">https://docs.nvidia.com/cuda/volta-compatibility-guide/index.html#building-volta-compatible-apps-using-cuda-9-0</a><br></div><div><a href="https://docs.nvidia.com/cuda/turing-compatibility-guide/index.html#building-turing-compatible-apps-using-cuda-10-0" target="_blank">https://docs.nvidia.com/cuda/turing-compatibility-guide/index.html#building-turing-compatible-apps-using-cuda-10-0</a><br></div><div><a href="https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html#building-ampere-compatible-apps-using-cuda-11-0" target="_blank">https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html#building-ampere-compatible-apps-using-cuda-11-0</a><br></div><div><br></div><div>Programmatic querying can be done with the following (<a href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html" target="_blank">https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html</a>):</div><div><br></div><div><span style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px;background-color:rgb(239,239,240)">cudaDeviceGetAttribute</span><br></div><div><ul style="color:rgb(0,0,0);font-family:"Trebuchet MS","DIN Pro",sans-serif;font-size:14px"><li><p style="margin:0px"><a 
href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg49e2f8c2c0bd6fe264f2fc970912e5cd220ff111a6616ab512e229d8f2f8bf87" shape="rect" style="color:rgb(118,185,0)" target="_blank">cudaDevAttrComputeCapabilityMajor</a>: Major compute capability version number;</p></li><li><p style="margin:0px"><a href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg49e2f8c2c0bd6fe264f2fc970912e5cd2c981c76c9de58d39502e483a7b484c7" shape="rect" style="color:rgb(118,185,0)" target="_blank">cudaDevAttrComputeCapabilityMinor</a>: Minor compute capability version number;</p></li></ul></div><div>The compiler help tells me this, which can be cross-referenced with CUDA documentation above.</div><div><br></div><div>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">$ /usr/local/cuda-10.0/bin/nvcc -h</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0);min-height:17px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Usage<span>  </span>: nvcc [options] <inputfile></span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><br></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)">...</p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><br></p></div><div>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Options for steering GPU code generation.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">=========================================</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0);min-height:17px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">--gpu-architecture <arch><span>                  </span>(-arch)<span>                         </span></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Specify the name of the class of NVIDIA 'virtual' GPU architecture for which</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>the CUDA input files must be compiled.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>With the exception as described for the shorthand below, the architecture</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>specified with this option must be a 'virtual' architecture (such as compute_50).</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Normally, this option alone does not trigger assembly of the generated PTX</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>for a 'real' architecture (that is the role of nvcc option '--gpu-code',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>see below); rather, its purpose is to control preprocessing and compilation</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>of the input to PTX.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>For convenience, in case of simple nvcc compilations, the following shorthand</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>is supported.<span>  </span>If no value for option '--gpu-code' is specified, then the</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>value of this option defaults to the value of '--gpu-architecture'.<span>  </span>In this</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>situation, as only exception to the description above, the value specified</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>for '--gpu-architecture' may be a 'real' architecture (such as a sm_50),</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>in which case nvcc uses the specified 'real' architecture and its closest</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'virtual' architecture as effective architecture values.<span>  </span>For example, 'nvcc</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>--gpu-architecture=sm_50' is equivalent to 'nvcc --gpu-architecture=compute_50</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>--gpu-code=sm_50,compute_50'.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Allowed values for this option:<span>  </span>'compute_30','compute_32','compute_35',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'compute_37','compute_50','compute_52','compute_53','compute_60','compute_61',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'compute_62','compute_70','compute_72','compute_75','sm_30','sm_32','sm_35',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'sm_37','sm_50','sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'sm_75'.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0);min-height:17px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">--gpu-code <code>,...<span>                      </span>(-code)<span>                         </span></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Specify the name of the NVIDIA GPU to assemble and optimize PTX for.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>nvcc embeds a compiled code image in the resulting executable for each specified</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span><code> architecture, which is a true binary load image for each 'real' architecture</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>(such as sm_50), and PTX code for the 'virtual' architecture (such as compute_50).</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>During runtime, such embedded PTX code is dynamically compiled by the CUDA</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>runtime system if no binary load image is found for the 'current' GPU.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Architectures specified for options '--gpu-architecture' and '--gpu-code'</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>may be 'virtual' as well as 'real', but the <code> architectures must be</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>compatible with the <arch> architecture.<span>  </span>When the '--gpu-code' option is</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>used, the value for the '--gpu-architecture' option must be a 'virtual' PTX</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>architecture.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>For instance, '--gpu-architecture=compute_35' is not compatible with '--gpu-code=sm_30',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>because the earlier compilation stages will assume the availability of 'compute_35'</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>features that are not present on 'sm_30'.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Allowed values for this option:<span>  </span>'compute_30','compute_32','compute_35',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'compute_37','compute_50','compute_52','compute_53','compute_60','compute_61',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'compute_62','compute_70','compute_72','compute_75','sm_30','sm_32','sm_35',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'sm_37','sm_50','sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72',</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'sm_75'.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0);min-height:17px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">--generate-code <specification>,...<span>        </span>(-gencode) <span>                     </span></span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>This option provides a generalization of the '--gpu-architecture=<arch> --gpu-code=<code>,</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>...' option combination for specifying nvcc behavior with respect to code</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>generation.<span>  </span>Where use of the previous options generates code for different</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'real' architectures with the PTX for the same 'virtual' architecture, option</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'--generate-code' allows multiple PTX generations for different 'virtual'</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>architectures.<span>  </span>In fact, '--gpu-architecture=<arch> --gpu-code=<code>,</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>...' is equivalent to '--generate-code arch=<arch>,code=<code>,...'.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>'--generate-code' options may be repeated for different virtual architectures.</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:13px;line-height:normal;font-family:Monaco;color:rgb(242,242,242);background-color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>        </span>Allowed keywords for this option:<span>  </span>'arch','code'.</span></p></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 5, 2021 at 1:19 PM Satish Balay via petsc-dev <<a href="mailto:petsc-dev@mcs.anl.gov" target="_blank">petsc-dev@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This is an NVIDIA mess-up. Why isn't there a command that gives me these values [if they insist on this interface for nvcc]?<br>
<br>
I see Barry wants configure to do something here - but whatever we do - we would be shifting the problem around.<br>
[even if we detect stuff - the build box might not have the GPU used for runs.]<br>
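<br>
For illustration, detection on a build box could be sketched in Python as below. This is only a sketch under assumptions: it relies on the installed driver's nvidia-smi supporting the compute_cap query field (older drivers may not), the helper names parse_compute_caps and detect_compute_caps are hypothetical, and it returns an empty list whenever no GPU or no nvidia-smi is found, which is exactly the build-box caveat above.<br>

```python
import shutil
import subprocess


def parse_compute_caps(text):
    """Turn lines like '7.0' into sorted unique strings like '70'."""
    caps = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        major, _, minor = line.partition(".")
        # Skip anything that is not a plain "major.minor" number pair.
        if major.isdigit() and minor.isdigit():
            caps.add(major + minor)
    return sorted(caps, key=int)


def detect_compute_caps():
    """Query the local GPUs; return [] when nothing can be detected."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        # Assumption: this driver's nvidia-smi knows the compute_cap field.
        out = subprocess.run(
            [exe, "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        return []
    return parse_compute_caps(out)


print(detect_compute_caps() or "no GPU detected; pass the arch explicitly")
```

An empty result should fall back to a user-supplied value rather than a guess, since the run-time GPU may differ from the build machine.<br>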
<br>
We have --with-cuda-arch - which I tried to remove from configure - but it's come back in a different form (--with-cuda-gencodearch).<br>
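<br>
For context, a numeric arch value such as 60 or 70 can be expanded into the nvcc -gencode options described in the compiler help. The Python sketch below shows one way to do that; gencode_flags is a hypothetical helper, not what PETSc's configure actually does, and adding PTX for the newest listed architecture is a common choice rather than a requirement.<br>

```python
def gencode_flags(capabilities):
    """Build nvcc -gencode options for a fat binary.

    capabilities: iterable of strings such as "60" or "70".
    Emits a binary image (sm_XX) for each listed architecture, plus PTX
    (compute_XX) for the newest one so that newer GPUs can JIT-compile it.
    """
    caps = sorted(set(capabilities), key=int)
    flags = [f"-gencode arch=compute_{cc},code=sm_{cc}" for cc in caps]
    if caps:
        newest = caps[-1]
        flags.append(f"-gencode arch=compute_{newest},code=compute_{newest}")
    return flags


print(" ".join(gencode_flags(["60", "70", "75"])))
```

Listing every architecture from the oldest supported one forward increases binary size and compile time, so the list is usually kept to the hardware actually in use.<br>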
<br>
And I see other packages:<br>
<br>
  --with-kokkos-cuda-arch<br>
<br>
Wrt spack - I'm having to do:<br>
<br>
spack install xsdk+cuda ^magma cuda_arch=60<br>
<br>
[magma uses CudaPackage() infrastructure in spack]<br>
<br>
Satish<br>
<br>
On Mon, 5 Apr 2021, Mills, Richard Tran via petsc-dev wrote:<br>
<br>
> You raise a good point, Barry. I've been completely mystified by what some of these names even mean. What does "PASCAL60" vs. "PASCAL61" even mean? Do you know where this is even documented? I can't really find anything about it in the Kokkos documentation. The only thing I can really find is an issue or two about "hey, shouldn't our CMake stuff figure this out automatically" and then some posts about why it can't really do that. Not encouraging.<br>
> <br>
> --Richard<br>
> <br>
> On 4/3/21 8:42 PM, Barry Smith wrote:<br>
> <br>
> <br>
>   It would be very nice to NOT require PETSc users to provide this flag; how the heck will they know what it should be when we cannot automate it ourselves?<br>
> <br>
>   Any ideas on how this can be determined based on the current system? NVIDIA does not help since these "advertising" names don't seem to trivially map to information you can get from a particular GPU when you are logged into it. For example nvidia-smi doesn't use these names directly. Is there some mapping from nvidia-smi to these names we could use? If we are serious about having a non-trivial number of users utilizing GPUs, which we need to be for the future, we cannot have these absurd demands in our installation process.<br>
> <br>
>   Barry<br>
> <br>
> Does spack have some magic for this we could use?<br>
> <br>
> <br>
> <br>
> <br>
<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>
</blockquote></div></div>
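<div><br></div>
<div>On the question of mapping compute capability versions to arch names, one possible Python sketch follows. The family names per major number come from the CUDA Programming Guide passage quoted above, with Turing as the 7.5 special case. The helper kokkos_arch_option and the table FAMILY_BY_MAJOR are hypothetical, and the sketch assumes every Kokkos option follows the FAMILY-plus-digits pattern seen in Kokkos_ARCH_VOLTA70 and Kokkos_ARCH_TURING75, which should be checked against the Kokkos documentation.</div>

```python
# Architecture family by major compute capability, per the CUDA Programming
# Guide passage quoted in this thread. Turing (7.5) shares major 7 with Volta.
FAMILY_BY_MAJOR = {
    1: "TESLA",
    2: "FERMI",
    3: "KEPLER",
    5: "MAXWELL",
    6: "PASCAL",
    7: "VOLTA",
    8: "AMPERE",
}


def kokkos_arch_option(major, minor):
    """Return a Kokkos-style option name for a compute capability.

    Assumption: the name is always FAMILY followed by the two digits;
    this does not check that Kokkos really defines the resulting option.
    """
    if (major, minor) == (7, 5):
        family = "TURING"
    else:
        family = FAMILY_BY_MAJOR[major]  # raises KeyError for unknown majors
    return f"Kokkos_ARCH_{family}{major}{minor}"


print(kokkos_arch_option(7, 0))
print(kokkos_arch_option(7, 5))
print(kokkos_arch_option(6, 1))
```

<div>The major and minor numbers fed into such a function are the values that cudaDevAttrComputeCapabilityMajor and cudaDevAttrComputeCapabilityMinor report for a device.</div>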