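Vulkan Memory (Part 1): Basics

Memory Overview
- Memory Properties
- Allocating Memory
- Choosing a Memory Type
- Suballocation
- MapMemory
- Dedicated Allocation
Reference

Memory Overview

Vulkan manages resources and memory separately. This gives the application more freedom in how it manages memory, including memory pooling and memory reuse, but it also brings additional concerns:

The application has to choose the memory type and heap itself.
The application has to respect the limit on the number of live memory allocations; the guaranteed minimum for VkPhysicalDeviceLimits::maxMemoryAllocationCount is only 4096.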

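Memory Properties

The memory properties of the current physical device are described by VkPhysicalDeviceMemoryProperties, which is filled in by vkGetPhysicalDeviceMemoryProperties: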

typedef struct VkPhysicalDeviceMemoryProperties {
    uint32_t        memoryTypeCount;
    VkMemoryType    memoryTypes[VK_MAX_MEMORY_TYPES];
    uint32_t        memoryHeapCount;
    VkMemoryHeap    memoryHeaps[VK_MAX_MEMORY_HEAPS];
} VkPhysicalDeviceMemoryProperties;

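The individual MEMORY_PROPERTY bits mean:

DEVICE_LOCAL: most efficient for device access. The bit is set if and only if the memory type belongs to a heap whose flags include HEAP_DEVICE_LOCAL.
HOST_VISIBLE: the memory can be mapped for host access with vkMapMemory.
HOST_COHERENT: host writes become visible to the device, and device writes become visible to the host, without calling vkFlushMappedMemoryRanges and vkInvalidateMappedMemoryRanges.
HOST_CACHED: the memory is cached on the host, so host accesses are faster than to uncached memory.
LAZILY_ALLOCATED: visible to the device only and mutually exclusive with HOST_VISIBLE. The memory actually backing a bound resource may be zero bytes, the full requested size, or grow monotonically as it is used, depending on the driver. Mostly used for framebuffer attachments.
PROTECTED: visible to the device only and mutually exclusive with the HOST_* flags. Allows protected queue operations to access the memory.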
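
For memory types that lack HOST_COHERENT, the ranges passed to vkFlushMappedMemoryRanges and vkInvalidateMappedMemoryRanges have to respect VkPhysicalDeviceLimits::nonCoherentAtomSize. The following is only a sketch of how such a range could be computed: FlushRange and AlignedFlushRange are hypothetical names, plain integers stand in for the Vulkan types, and the whole allocation is assumed to be mapped.

```cpp
#include <algorithm>
#include <cstdint>

// Plain-integer stand-in for the offset/size pair of VkMappedMemoryRange.
struct FlushRange {
    uint64_t offset;
    uint64_t size;
};

// Hypothetical helper: widen [offset, offset + size) so that it starts and ends
// on multiples of atomSize, then clamp the end to allocationSize.
// Assumes atomSize > 0 and offset + size <= allocationSize.
inline FlushRange AlignedFlushRange(uint64_t offset, uint64_t size,
                                    uint64_t atomSize, uint64_t allocationSize)
{
    const uint64_t begin = (offset / atomSize) * atomSize;                         // round down
    const uint64_t end   = ((offset + size + atomSize - 1) / atomSize) * atomSize; // round up
    return { begin, std::min(end, allocationSize) - begin };
}
```

The resulting offset and size would then be copied into a VkMappedMemoryRange. Clamping to the allocation size covers the case where the allocation itself does not end on an atom boundary.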

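Without extensions, the possible flag combinations are listed in the table below. What a particular device actually supports is reported in the VkPhysicalDeviceMemoryProperties::memoryTypes array.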

	DEVICE_LOCAL	HOST_VISIBLE	HOST_COHERENT	HOST_CACHED	LAZILY_ALLOCATED	PROTECTED
0
1		✓	✓
2		✓		✓
3		✓	✓	✓
4	✓
5	✓	✓	✓
6	✓	✓		✓
7	✓	✓	✓	✓
8	✓				✓
9						✓
10	✓					✓

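The spec requires that:

At least one memory type has both HOST_VISIBLE and HOST_COHERENT set.
At least one memory type has DEVICE_LOCAL set.

The returned memoryTypes array is also ordered. For two memory types X and Y, X is placed at a lower index than Y if either:

The propertyFlags of X are a strict subset of the propertyFlags of Y.
The propertyFlags of X and Y are equal, and X belongs to a heap with higher performance.

This ordering lets the application find the best matching memory type in a single pass: the first type that satisfies the requirements is the one to pick.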
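
To make the subset half of the ordering rule concrete, here is a small sketch that checks it on a list of propertyFlags values. IsStrictSubset and IsSubsetOrdered are hypothetical helpers, and a std::vector of raw flag values stands in for the memoryTypes array; the heap-performance half of the rule is not checked.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True when every bit of a is also set in b and the two sets differ.
inline bool IsStrictSubset(uint32_t a, uint32_t b)
{
    return (a & b) == a && a != b;
}

// Hypothetical checker: no memory type may appear after a type whose
// propertyFlags strictly contain its own.
inline bool IsSubsetOrdered(const std::vector<uint32_t>& typeFlags)
{
    for (size_t i = 0; i < typeFlags.size(); ++i)
        for (size_t j = i + 1; j < typeFlags.size(); ++j)
            if (IsStrictSubset(typeFlags[j], typeFlags[i]))
                return false;
    return true;
}
```

For example, a list that puts HOST_VISIBLE | HOST_COHERENT (0x6) before HOST_VISIBLE | HOST_COHERENT | HOST_CACHED (0xE) passes the check, while the reverse order does not.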

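Allocating Memory

Allocating memory in Vulkan takes the following steps (sparse resources are not considered here):

Create the resource (Buffer / Image).
Query its VkMemoryRequirements with vkGetImageMemoryRequirements or vkGetBufferMemoryRequirements.
Use the VkMemoryRequirements to pick the best memoryTypeIndex from VkPhysicalDeviceMemoryProperties::memoryTypes.
Allocate the memory with vkAllocateMemory.

The memoryTypeIndex search can be modeled on the following function: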

// Returns the index of the first memory type that is allowed by memoryTypeBits
// and has all of requiredProperties set, or -1 if there is none.
int32_t FindProperties(const VkPhysicalDeviceMemoryProperties* properties, uint32_t memoryTypeBits, VkMemoryPropertyFlags requiredProperties)
{
    const uint32_t memoryCount = properties->memoryTypeCount;
    for (uint32_t i = 0; i < memoryCount; ++i) {
        // Unsigned literal avoids signed overflow when i is 31.
        const bool isRequiredMemoryType  = (memoryTypeBits & (1u << i)) != 0;
        const bool hasRequiredProperties = (properties->memoryTypes[i].propertyFlags & requiredProperties) == requiredProperties;
        if (isRequiredMemoryType && hasRequiredProperties)
            return static_cast<int32_t>(i);
    }
    return -1;
}

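Choosing a Memory Type

A device exposes only a limited set of memoryTypes combinations. Grouped by use case, the commonly used ones are:

DEVICE_LOCAL: for data the GPU accesses frequently and the CPU uploads once or only rarely. Typical uses:
- Framebuffer attachments
- Static mesh VB / IB
- Static images
DEVICE_LOCAL | HOST_VISIBLE: for data written by the CPU and read by the GPU; can serve as a fallback for DEVICE_LOCAL. Typical uses:
- Dynamic mesh VB / IB
- Uniform buffers
HOST_VISIBLE | HOST_COHERENT: also for CPU-written, GPU-read data, but GPU accesses go over PCIe; can serve as a fallback for DEVICE_LOCAL | HOST_VISIBLE. Typical uses:
- Uniform buffers
- Staging buffers for transfers
HOST_VISIBLE | HOST_CACHED: for readback, where the GPU writes and the CPU reads. Typical uses:
- Reading back compute pipeline results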
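
The fallback relationships in the list above can be sketched as an ordered list of candidate flag sets that is tried one by one. FindWithFallback is a hypothetical helper, the constants are local copies of the VK_MEMORY_PROPERTY_*_BIT values, and a std::vector of flag values (at most 32 entries) stands in for the memoryTypes array so the sketch can run without a device.

```cpp
#include <cstdint>
#include <initializer_list>
#include <vector>

// Numeric values match the VK_MEMORY_PROPERTY_*_BIT enumerants.
constexpr uint32_t DEVICE_LOCAL  = 0x1;
constexpr uint32_t HOST_VISIBLE  = 0x2;
constexpr uint32_t HOST_COHERENT = 0x4;
constexpr uint32_t HOST_CACHED   = 0x8;

// Hypothetical helper: try each candidate flag set in order of preference and
// return the first memory type index that is allowed by memoryTypeBits and
// contains all of the candidate's flags, or -1 if no candidate matches.
// typeFlags[i] stands in for memoryTypes[i].propertyFlags.
inline int32_t FindWithFallback(const std::vector<uint32_t>& typeFlags,
                                uint32_t memoryTypeBits,
                                std::initializer_list<uint32_t> candidates)
{
    for (uint32_t wanted : candidates) {
        for (uint32_t i = 0; i < typeFlags.size(); ++i) {
            const bool allowed = (memoryTypeBits & (1u << i)) != 0;
            if (allowed && (typeFlags[i] & wanted) == wanted)
                return static_cast<int32_t>(i);
        }
    }
    return -1;
}
```

For dynamic uniform data, for instance, the candidate list could be { DEVICE_LOCAL | HOST_VISIBLE, HOST_VISIBLE | HOST_COHERENT }, so that a device without CPU-writable device memory falls back to system memory.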

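Real applications are more complicated than the combinations above: they also have to handle out-of-memory conditions and fallback paths. AMD's Vulkan Memory Allocator (VMA) is worth considering here. The library reduces the use cases to the following usages and manages memory pools internally; an analysis of VMA will follow in a later post.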

CPU_ONLY
GPU_ONLY
CPU_TO_GPU
GPU_TO_CPU
CPU_COPY
GPU_LAZILY_ALLOCATED

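Suballocation

Reasons to consider suballocation:

The number of Vulkan memory allocations is limited by VkPhysicalDeviceLimits::maxMemoryAllocationCount, and only 4096 are guaranteed. A typical pitfall is allocating a per-object UBO for every object, which exhausts the limit very quickly; exceeding it is undefined behavior.
Memory alignment affects performance.
It avoids allocating and freeing memory at run time.

The application should therefore allocate memory blocks up front and hand out ranges from them itself. A block size of 256 MB is recommended. A few suggested rules for alignment:

Align the offset and size of image resources to max(VkPhysicalDeviceLimits::bufferImageGranularity, VkMemoryRequirements::alignment).
Track each allocation's resource type and have the allocator add the necessary padding when neighbouring resources differ in type.
Manage images and buffers in separate memory pools. The smaller padding reduces fragmentation, but with large blocks this also wastes some memory.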
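
The "track the resource type and pad when it changes" rule can be sketched with a simple bump allocator over one pre-allocated block. LinearBlock, ResourceKind, AlignUp and kAllocationFailed are hypothetical names; size and alignment are assumed to come from VkMemoryRequirements, the granularity from VkPhysicalDeviceLimits::bufferImageGranularity, and both alignment values are assumed to be non-zero. Freeing is left out on purpose.

```cpp
#include <cstdint>

// Buffers and linear images vs. optimally tiled images.
enum class ResourceKind { None, Linear, NonLinear };

constexpr uint64_t kAllocationFailed = UINT64_MAX;

inline uint64_t AlignUp(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical bump allocator over one pre-allocated memory block.
class LinearBlock {
public:
    LinearBlock(uint64_t blockSize, uint64_t bufferImageGranularity)
        : size_(blockSize), granularity_(bufferImageGranularity) {}

    // Returns the offset to bind the resource at, or kAllocationFailed if it does not fit.
    uint64_t Allocate(uint64_t size, uint64_t alignment, ResourceKind kind)
    {
        uint64_t offset = head_;
        // Move to the next granularity page only when the neighbour is of the other kind.
        if (last_ != ResourceKind::None && last_ != kind)
            offset = AlignUp(offset, granularity_);
        offset = AlignUp(offset, alignment);
        if (offset > size_ || size > size_ - offset)
            return kAllocationFailed;
        head_ = offset + size;
        last_ = kind;
        return offset;
    }

private:
    uint64_t     size_;
    uint64_t     granularity_;
    uint64_t     head_ = 0;
    ResourceKind last_ = ResourceKind::None;
};
```

Resources of the same kind are packed using only their own alignment, so the granularity padding is paid once per change of kind rather than once per resource.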

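MapMemory

vkMapMemory returns a host virtual address pointer for HOST_VISIBLE memory. The application can keep this pointer mapped persistently, which has two advantages:

It avoids the cost of repeated map / unmap calls.
It simplifies writes to several resources that share the same memory object.

Exception:
On AMD GPUs with Windows versions older than 10, keeping DEVICE_LOCAL + HOST_VISIBLE memory persistently mapped may cause the memory to be migrated to system memory.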
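
A persistently mapped block could be wrapped roughly as follows. MappedBlock is a hypothetical class; it only assumes that the base pointer was obtained once from vkMapMemory and stays valid for as long as the block exists, and it does not call Vulkan itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical wrapper around a pointer obtained once from vkMapMemory and
// kept for the lifetime of the memory block.
class MappedBlock {
public:
    MappedBlock(void* mappedBase, uint64_t size)
        : base_(static_cast<unsigned char*>(mappedBase)), size_(size) {}

    // Copies bytes into the block at a suballocation's offset.
    // Returns false instead of writing out of bounds.
    bool Write(uint64_t offset, const void* data, uint64_t bytes)
    {
        if (offset > size_ || bytes > size_ - offset)
            return false;
        std::memcpy(base_ + offset, data, static_cast<size_t>(bytes));
        return true;
    }

private:
    unsigned char* base_;
    uint64_t       size_;
};
```

Every resource suballocated from the block writes through the same pointer at its own offset, so no per-resource map / unmap is needed. For memory without HOST_COHERENT, the written range still has to be flushed afterwards.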

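Dedicated Allocation

General-purpose device memory has to support suballocation, memory aliasing and sparse binding, and that generality can get in the way of optimizations for specific cases. A device may therefore offer dedicated memory that performs better in those cases.

Dedicated allocation requires the device extension VK_KHR_dedicated_allocation (promoted to core in Vulkan 1.1) and uses the following structures:

VkMemoryDedicatedRequirements
VkMemoryDedicatedAllocateInfo

Querying support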

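// Identify the image whose requirements are being queried.
VkImageMemoryRequirementsInfo2 memoryReqsInfo = {};
memoryReqsInfo.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2;
memoryReqsInfo.image = image;

// Chain VkMemoryDedicatedRequirements so the driver fills it in as well.
VkMemoryDedicatedRequirements memDedicatedReq = {};
memDedicatedReq.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS;

VkMemoryRequirements2 memoryReqs2 = {};
memoryReqs2.sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2;
memoryReqs2.pNext = &memDedicatedReq;
vkGetImageMemoryRequirements2(vkDevice, &memoryReqsInfo, &memoryReqs2);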

The query fills in VkMemoryDedicatedRequirements, where:

prefersDedicatedAllocation: a dedicated allocation is recommended for better performance.
requiresDedicatedAllocation: a dedicated allocation is mandatory.

Allocating memory when prefersDedicatedAllocation is VK_TRUE

// Ties the allocation to this one image.
VkMemoryDedicatedAllocateInfo dedicatedInfo = {};
dedicatedInfo.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO;
dedicatedInfo.image = image;

VkMemoryAllocateInfo memoryAllocateInfo = {};
memoryAllocateInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
// Chain the dedicated info only when the driver asked for it.
memoryAllocateInfo.pNext = memDedicatedReq.prefersDedicatedAllocation ? &dedicatedInfo : nullptr;
memoryAllocateInfo.allocationSize = memoryReqs2.memoryRequirements.size;
// FindProperties returns -1 when no type matches; real code should check for that before use.
memoryAllocateInfo.memoryTypeIndex = FindProperties(&phyMemProps, memoryReqs2.memoryRequirements.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
VkDeviceMemory memory = VK_NULL_HANDLE;
vkAllocateMemory(vkDevice, &memoryAllocateInfo, nullptr, &memory);

Reference

[1] Arseny Kapoulkine. “Writing an Efficient Vulkan Renderer”. GPU Zen 2
[2] “Memory Management in Vulkan™ and DX12”. GDC 2018
