Python Numba CUDA: copying to host is slow


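I recently started looking into using CUDA to optimize searches over numeric arrays. Below is a simplified piece of code that demonstrates the problem.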

import numpy as np
import time
from numba import cuda


@cuda.jit
def count_array4(device_array,  pivot_point, device_output_array):
    for i in range(len(device_array)):
        if (pivot_point - 0.05) < device_array[i] < (pivot_point + 0.05):
            device_output_array[i] = True
        else:
            device_output_array[i] = False


width = 512
height = 512
size = width * height
print(f'Number of records {size}')
array_of_random = np.random.rand(size)
device_array = cuda.to_device(array_of_random)

start = time.perf_counter()
device_output_array = cuda.device_array(size)
print(f'Copy Host to Device: {time.perf_counter() - start}')

for x in range(10):
    start = time.perf_counter()
    count_array4[512, 512](device_array,  .5, device_output_array)
    print(f'Run: {x} Time: {time.perf_counter() - start}')

start = time.perf_counter()
output_array = device_output_array.copy_to_host()
print(f'Copy Device to Host: {time.perf_counter() - start}')

print(np.sum(output_array))

This gives me the processing speedup I expected, but copying the data back to the host seems to take an extremely long time.

Number of records 262144
Copy Host to Device: 0.00031610000000004135
Run: 0 Time: 0.0958601
Run: 1 Time: 0.0001626999999999601
Run: 2 Time: 0.00012100000000003774
Run: 3 Time: 0.00011590000000005762
Run: 4 Time: 0.00011419999999995323
Run: 5 Time: 0.0001126999999999656
Run: 6 Time: 0.00011289999999997136
Run: 7 Time: 0.0001122999999999541
Run: 8 Time: 0.00011490000000002887
Run: 9 Time: 0.00011200000000000099
Copy Device to Host: 13.0583358
26110.0

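I am fairly sure I am missing something basic here, or a technique whose name I don't know to search for. If anyone can point me in the right direction, I would be very grateful.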

python memory-management cuda gpu-programming
1 Answer

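Kernel launches are asynchronous, and the driver can queue up multiple launches. As a result, your loop measures only the kernel launch overhead, and the data transfer, which is a blocking call, then absorbs all of the kernel execution time. You can change this behavior by modifying your code as follows: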

for x in range(10):
    start = time.perf_counter()
    count_array4[512, 512](device_array,  .5, device_output_array)
    cuda.synchronize()
    print(f'Run: {x} Time: {time.perf_counter() - start}')

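The synchronize call ensures that each kernel launch has completed and the device is idle before the next kernel is launched. As a result, the time reported for each kernel run should go up, and the reported transfer time should go down.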
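
The same shift in where the time shows up can be reproduced without a GPU. The following is only an analogy, not Numba's API: a single-worker thread pool stands in for the driver's launch queue, the hypothetical `fake_kernel` just sleeps, and `future.result()` plays the role of `cuda.synchronize()` (inside the loop) or of the blocking `copy_to_host()` (after the loop). All names here are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_kernel(duration):
    """Stand-in for GPU work (hypothetical): sleeps for `duration` seconds."""
    time.sleep(duration)


def time_launches(n_launches, duration, synchronize):
    """Return (per-launch timings, time spent in the final blocking wait)."""
    launch_times = []
    # One worker thread plays the device: queued work runs in submission order.
    with ThreadPoolExecutor(max_workers=1) as device_queue:
        pending = []
        for _ in range(n_launches):
            start = time.perf_counter()
            # submit() only enqueues the work and returns immediately,
            # much like an asynchronous kernel launch.
            future = device_queue.submit(fake_kernel, duration)
            if synchronize:
                future.result()  # analogous to cuda.synchronize()
            launch_times.append(time.perf_counter() - start)
            pending.append(future)

        # Analogous to the blocking copy_to_host(): wait for all queued work.
        start = time.perf_counter()
        for future in pending:
            future.result()
        final_wait = time.perf_counter() - start
    return launch_times, final_wait


if __name__ == "__main__":
    for sync in (False, True):
        launches, wait = time_launches(5, 0.05, synchronize=sync)
        print(f"synchronize={sync}: "
              f"launches={sum(launches):.3f}s, final wait={wait:.3f}s")
```

Without synchronization the per-launch timings are tiny and the final wait carries nearly all of the work. With synchronization the work is charged to each launch and the final wait becomes negligible. The total time is roughly the same in both cases; only the place where it is measured changes.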
