How can I get the number of CUDA cores in my GPU using Python and Numba?
Problem description:
I would like to know how to obtain the total number of CUDA cores in my GPU using Python, Numba and cudatoolkit.
Answer:
This can be done by combining two existing answers.
We'll use the first answer to show how to get the device compute capability and the number of streaming multiprocessors (SMs). We'll use the second answer (converted to Python) to map the compute capability to the "core" count per SM, then multiply that by the number of SMs.
Here is a complete example:
$ cat t36.py
from numba import cuda
cc_cores_per_SM_dict = {
    (2,0) : 32,
    (2,1) : 48,
    (3,0) : 192,
    (3,5) : 192,
    (3,7) : 192,
    (5,0) : 128,
    (5,2) : 128,
    (6,0) : 64,
    (6,1) : 128,
    (7,0) : 64,
    (7,5) : 64,
    (8,0) : 64,
    (8,6) : 128
    }
# the .get() lookup below returns "None" if no match for the compute
# capability is found in the above dictionary. The dictionary needs to be
# extended as new devices become available, and currently does not account
# for all Jetson devices
device = cuda.get_current_device()
my_sms = getattr(device, 'MULTIPROCESSOR_COUNT')
my_cc = device.compute_capability
cores_per_sm = cc_cores_per_SM_dict.get(my_cc)
total_cores = cores_per_sm*my_sms
print("GPU compute capability: " , my_cc)
print("GPU total number of SMs: " , my_sms)
print("total cores: " , total_cores)
$ python t36.py
GPU compute capability: (5, 2)
GPU total number of SMs: 8
total cores: 1024
$
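Note that cc_cores_per_SM_dict.get() returns None when the compute capability is not in the table, so the multiplication above would fail with a TypeError on an unlisted device. A minimal sketch of a more defensive variant is shown below; the helper name total_cuda_cores is just illustrative and not part of the original example:

from numba import cuda

def total_cuda_cores(cc_cores_per_SM_dict):
    # Illustrative helper: compute the total core count, raising a clear
    # error when the compute capability is missing from the lookup table.
    device = cuda.get_current_device()
    cc = device.compute_capability
    sms = getattr(device, 'MULTIPROCESSOR_COUNT')
    cores_per_sm = cc_cores_per_SM_dict.get(cc)
    if cores_per_sm is None:
        raise ValueError("cores per SM unknown for compute capability %s; "
                         "please extend the dictionary" % (cc,))
    return cores_per_sm * sms

# usage, reusing the dictionary defined in t36.py above:
# print("total cores:", total_cuda_cores(cc_cores_per_SM_dict))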