<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

</head>

<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">

<span class="">When PETSc is initialized, it takes about 2GB CUDA memory. This is way too much for doing nothing. A test script is attached to reproduce the issue. If I remove the first line "import torch", PETSc consumes about 0.73GB, which is still significant.

 Does anyone have any idea about this behavior?</span>

<div class=""><br class="">

</div>

<div class="">Thanks,</div>

<div class="">Hong<br class="">

<div class=""><br class="">

</div>

<div class="">

<pre class="c-mrkdwn__pre" data-stringify-type="pre" style="box-sizing: inherit; margin-top: 4px; margin-bottom: 4px; padding: 8px; --saf-0: rgba(var(--sk_foreground_low,29,28,29),0.13); font-size: 12px; line-height: 1.50001; font-variant-ligatures: none; white-space: pre-wrap; overflow-wrap: break-word; word-break: normal; tab-size: 4; border: 1px solid var(--saf-0); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background: rgba(var(--sk_foreground_min,29,28,29),0.04); counter-reset: list-0 0 list-1 0 list-2 0 list-3 0 list-4 0 list-5 0 list-6 0 list-7 0 list-8 0 list-9 0; color: rgb(29, 28, 29); orphans: 2; widows: 2; text-decoration-thickness: initial; font-family: Monaco, Menlo, Consolas, "Courier New", monospace !important;">hongzhang@gpu02:/gpfs/jlse-fs0/users/hongzhang/Projects/pnode/examples (caidao22/update-examples)$ python3 test.py

CUDA memory before PETSc 0.000GB

CUDA memory after PETSc 0.004GB

hongzhang@gpu02:/gpfs/jlse-fs0/users/hongzhang/Projects/pnode/examples (caidao22/update-examples)$ python3 test.py -log_view :0.txt

CUDA memory before PETSc 0.000GB

CUDA memory after PETSc 1.936GB</pre>

<div class=""><br class="">

</div>

</div>

<div class="">

<pre class="c-mrkdwn__pre" data-stringify-type="pre" style="box-sizing: inherit; margin-top: 4px; margin-bottom: 4px; padding: 8px; --saf-0: rgba(var(--sk_foreground_low,29,28,29),0.13); font-size: 12px; line-height: 1.50001; font-variant-ligatures: none; white-space: pre-wrap; overflow-wrap: break-word; word-break: normal; tab-size: 4; border: 1px solid var(--saf-0); border-top-left-radius: 4px; border-top-right-radius: 4px; border-bottom-right-radius: 4px; border-bottom-left-radius: 4px; background: rgba(var(--sk_foreground_min,29,28,29),0.04); counter-reset: list-0 0 list-1 0 list-2 0 list-3 0 list-4 0 list-5 0 list-6 0 list-7 0 list-8 0 list-9 0; color: rgb(29, 28, 29); orphans: 2; widows: 2; text-decoration-thickness: initial; font-family: Monaco, Menlo, Consolas, "Courier New", monospace !important;">import torch

import sys

import os

import nvidia_smi

nvidia_smi.nvmlInit()

handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)

info = nvidia_smi.nvmlDeviceGetMemoryInfo(handle)

print('CUDA memory before PETSc %.3fGB' % (info.used/1e9))

petsc4py_path = os.path.join(os.environ['PETSC_DIR'],os.environ['PETSC_ARCH'],'lib')

sys.path.append(petsc4py_path)

import petsc4py

petsc4py.init(sys.argv)

handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)

info = nvidia_smi.nvmlDeviceGetMemoryInfo(handle)

print('CUDA memory after PETSc %.3fGB' % (info.used/1e9))</pre>

<div class=""><br class="">

</div>

</div>

</div>

</body>

</html>