Allocating unified memory across GPUs

Question

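I am using Python and CuPy on a GPU cluster (NVIDIA V100, four GPUs with 32 GB each). I need to allocate large arrays that do not fit on a single GPU, so I want CuPy to use unified memory that can be allocated across all available GPUs.

My code: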

import cupy as cp
import numpy as np

# allocate unified memory
pool = cp.cuda.MemoryPool(cp.cuda.malloc_managed)
cp.cuda.set_allocator(pool.malloc)

# Desired memory in GB
desired_memory_gb = 42 # SET this value to greater than 32 GB

# Calculate the number of elements required to achieve desired memory
element_size_bytes = np.dtype(np.float64).itemsize
desired_memory_bytes = desired_memory_gb * (1024**3)  # Convert GB to bytes
num_elements = desired_memory_bytes // element_size_bytes

# Create the array with the calculated number of elements
array = cp.full(num_elements, 1.1, dtype=np.float64)

# GPU
print("Array allocated on unified memory...")

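Problem: with the code above I can allocate unified memory and create arrays larger than 32 GB, but the unified memory is not spread across all GPUs. In this case, 32 GB is allocated on one GPU and the remainder on the CPU (rather than on GPU 2, 3, or 4).

How can I force the program to allocate unified memory on all GPUs, and not spill over to the CPU?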

Answer 1

You can use cp.cuda.set_allocator for each GPU and create a separate array on each device.

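import cupy as cp
import numpy as np

# Desired total memory in GB
desired_memory_gb = 42  # SET this value to greater than 32 GB

# Calculate the number of elements required to achieve desired memory
element_size_bytes = np.dtype(np.float64).itemsize
desired_memory_bytes = desired_memory_gb * (1024**3)  # Convert GB to bytes
num_elements = desired_memory_bytes // element_size_bytes

# Split the elements across the visible GPUs (ceiling division, so the
# total may exceed the request by up to num_gpus - 1 elements)
num_gpus = cp.cuda.runtime.getDeviceCount()
chunk_elements = -(-num_elements // num_gpus)

# Create one chunk on each GPU. cp.full has no device argument, so the
# target GPU is selected with the cp.cuda.Device context manager.
arrays = []
for device_id in range(num_gpus):
    with cp.cuda.Device(device_id):
        arrays.append(cp.full(chunk_elements, 1.1, dtype=np.float64))

# Print information
for i, array in enumerate(arrays):
    print(f"Array allocated on GPU-{i + 1} with shape {array.shape}")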
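If the goal is to spread one 42 GB array over the four GPUs rather than place a full-size array on each, the per-device chunk sizes can be worked out with plain arithmetic before any GPU allocation happens. The sketch below is only an illustration: split_elements is a hypothetical helper (not part of CuPy), and the four-device count and 8-byte float64 item size are taken from the question.

```python
def split_elements(total_elements, num_devices):
    """Return per-device chunk sizes that sum exactly to total_elements."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, remainder = divmod(total_elements, num_devices)
    # The first `remainder` devices take one extra element each.
    return [base + 1 if i < remainder else base for i in range(num_devices)]


GIB = 1024**3
itemsize = 8  # bytes per float64 element (assumed, as in the question)
total_elements = 42 * GIB // itemsize

chunks = split_elements(total_elements, 4)  # four GPUs, as in the question
per_device_gib = [n * itemsize / GIB for n in chunks]
print(per_device_gib)  # [10.5, 10.5, 10.5, 10.5]
```

Each chunk is about 10.5 GB, well under the 32 GB available on a single V100, so every piece fits in device memory without falling back to the host.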
