Get the number of logical CPU cores sharing a cache (L1, L2, L3)

Problem description:

Below is some C++ code that detects the sizes of the L1, L2 and L3 CPU caches on Windows using GetLogicalProcessorInformation:

typedef BOOL (WINAPI *LPFN_GLPI)(PSYSTEM_LOGICAL_PROCESSOR_INFORMATION, PDWORD);

LPFN_GLPI glpi = (LPFN_GLPI) GetProcAddress(
    GetModuleHandle(TEXT("kernel32")), "GetLogicalProcessorInformation");

if (glpi)
{
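    // The first call with a null buffer only reports the required size in bytes.
    DWORD bytes = 0;
    glpi(0, &bytes);
    size_t size = bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION);
    vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(size);

    // The second call fills the buffer; on failure, process no entries.
    if (!glpi(info.data(), &bytes))
        size = 0;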

    for (size_t i = 0; i < size; i++)
    {
        if (info[i].Relationship == RelationCache)
        {
            if (info[i].Cache.Level == 1)
              l1_cache_Size = info[i].Cache.Size;
            if (info[i].Cache.Level == 2)
              l2_cache_Size = info[i].Cache.Size;
            if (info[i].Cache.Level == 3)
              l3_cache_Size = info[i].Cache.Size;
        }
    }
}

As a next step I would like to get the number of logical CPU cores sharing a cache. On an x64 CPU with hyper-threading, two logical CPU cores usually share an L2 cache and all logical CPU cores share the L3 cache.

After reading through MSDN I thought that GetLogicalProcessorInformationEx, CACHE_RELATIONSHIP and GROUP_AFFINITY were what I was looking for, but after trying them out these data structures seem useless for my purpose.

Question:

Is there a way to get the number of logical CPU cores sharing a cache on Windows using C/C++? (Ideally without using cpuid directly.)

Solution:

The number of logical CPU cores sharing a cache can be obtained using GetLogicalProcessorInformationEx together with the CACHE_RELATIONSHIP and GROUP_AFFINITY data structures. The GROUP_AFFINITY.Mask value has one bit set for each logical CPU core that shares the current cache (RelationCache). For example, on most Intel CPUs with hyper-threading that have 4 physical and 8 logical CPU cores, GROUP_AFFINITY.Mask will have 2 bits set for an L2 cache and 8 bits set for the L3 cache.

Here is the C++ code:

#include <windows.h>
#include <vector>
#include <iostream>

using namespace std;

typedef BOOL (WINAPI *LPFN_GLPI)(LOGICAL_PROCESSOR_RELATIONSHIP,
    PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX, PDWORD);

int main()
{
    LPFN_GLPI glpi = (LPFN_GLPI) GetProcAddress(
        GetModuleHandle(TEXT("kernel32")), "GetLogicalProcessorInformationEx");

    if (!glpi)
        return 1;

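    // The first call with a null buffer only reports the required size in bytes.
    DWORD bytes = 0;
    glpi(RelationAll, 0, &bytes);
    if (bytes == 0)
        return 1;

    vector<char> buffer(bytes);
    SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX* info;

    // The second call fills the buffer with variable-length records.
    if (!glpi(RelationAll, (SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*) &buffer[0], &bytes))
        return 1;

    // Advance by each record's own Size field, since records differ in length.
    for (size_t i = 0; i < bytes; i += info->Size)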
    {
        info = (SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*) &buffer[i];

        if (info->Relationship == RelationCache &&
            (info->Cache.Type == CacheData ||
             info->Cache.Type == CacheUnified))
        {
            cout << "info->Cache.Level: " << (int) info->Cache.Level << endl;
            cout << "info->Cache.CacheSize: " << (int) info->Cache.CacheSize << endl;
            cout << "info->Cache.GroupMask.Group: " << info->Cache.GroupMask.Group << endl;
            cout << "info->Cache.GroupMask.Mask: " << info->Cache.GroupMask.Mask << endl << endl;
        }
    }

    return 0;
}
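
The program above prints the raw GroupMask.Mask value. To turn that value into the number of logical CPU cores sharing a cache, the set bits still have to be counted. Below is a minimal sketch of that step, assuming a hypothetical helper named count_set_bits (it is not part of the Windows API) and a mask that fits into 64 bits:

```cpp
#include <cstdint>

// Hypothetical helper, not part of the Windows API: counts the bits set in
// an affinity mask such as info->Cache.GroupMask.Mask.
// Each set bit stands for one logical processor that shares the cache.
inline unsigned count_set_bits(std::uint64_t mask)
{
    unsigned count = 0;
    while (mask != 0)
    {
        mask &= mask - 1; // clears the lowest set bit
        ++count;
    }
    return count;
}
```

Inside the loop it could be called as count_set_bits(info->Cache.GroupMask.Mask). With illustrative mask values (assumed here, not measured), 0x3 gives 2 and 0xFF gives 8, which matches the 2 bits for L2 and 8 bits for L3 described above.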

Notes:

I have found that when running Windows inside a virtual machine, the code above is unable to correctly detect the number of CPU cores sharing the caches. For example, on a VM with 2 virtual CPU cores, the code above reports that each logical CPU core has a private L1, L2 and L3 cache.

Answer

@RbMm: but CACHE_RELATIONSHIP contains all the info needed: number of logical CPU cores = number of bits set in Cache->GroupMask.Mask

I have tested this on AppVeyor CI (even before posting to Stack Overflow). Here is the output for an x64 CPU:

info->Cache.Level: 1
info->Cache.CacheSize: 32768
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 1

info->Cache.Level: 1
info->Cache.CacheSize: 32768
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 1

info->Cache.Level: 2
info->Cache.CacheSize: 262144
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 1

info->Cache.Level: 3
info->Cache.CacheSize: 31457280
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 1

info->Cache.Level: 1
info->Cache.CacheSize: 32768
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 2

info->Cache.Level: 1
info->Cache.CacheSize: 32768
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 2

info->Cache.Level: 2
info->Cache.CacheSize: 262144
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 2

info->Cache.Level: 3
info->Cache.CacheSize: 31457280
info->Cache.GroupMask.Group: 0
info->Cache.GroupMask.Mask: 2

Or, as a table:

| Cache Level |    Processor 1     |    Processor 2     |
|-------------|--------------------|--------------------|
| L1          |  32 KB Data        |  32 KB Data        |
|             |  32 KB Instruction |  32 KB Instruction |
|-------------|--------------------|--------------------|
| L2          | 256 KB Unified     | 256 KB Unified     |
|-------------|--------------------|--------------------|
| L3          |  30 MB Unified     |  30 MB Unified     |

According to the MSDN documentation:

GroupMask.Mask - A bitmap that specifies the affinity for zero or more processors within the specified group.

Based on this documentation I was expecting a different GroupMask.Mask for the L3 cache, but the output above does not show this. To me the data in GroupMask.Mask makes no sense!
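
One way to read the mask values in the output is to decode them into processor indices. Mask 1 has only bit 0 set and mask 2 has only bit 1 set, so each cache record in that output names exactly one logical processor. The sketch below assumes a hypothetical helper named processors_in_mask (not a Windows API function) and a mask of at most 64 bits:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the Windows API: returns the zero-based
// indices of the logical processors whose bits are set in an affinity mask.
inline std::vector<unsigned> processors_in_mask(std::uint64_t mask)
{
    std::vector<unsigned> indices;
    for (unsigned bit = 0; bit < 64; ++bit)
    {
        if (mask & (std::uint64_t{1} << bit))
            indices.push_back(bit);
    }
    return indices;
}
```

For the values printed above, processors_in_mask(1) yields index 0 and processors_in_mask(2) yields index 1. An L3 cache shared by both processors would be expected to show mask 3, which decodes to indices 0 and 1.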

Here is a link to the code that produces the data above
