String operations in Numba CUDA: clipping the first k characters from an array of strings, where k comes from another array

Question

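We have two arrays, arr1 (with string elements) and arr2 (with integers). I want to clip the first arr2[i] characters from arr1[i]. The arrays are very large, so I want to implement this in Numba CUDA. A plain Python implementation looks like this: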

arr1 = ['abc', 'def', 'xyz']
arr2 = [1,2,3]

def python_clipper(arr1,arr2):
    for i in range(len(arr1)):
        arr1[i] = arr1[i][arr2[i]:]
    return arr1

print(python_clipper(arr1,arr2)) # ['bc', 'f', '']

The implementation above works fine. But when I turn this Python function into a CUDA kernel like this:

from numba import cuda

@cuda.jit()
def cuda_clipper(arr1,arr2):
    i = cuda.grid(1)
    arr1[i] = arr1[i][arr2[i]:]

blockspergrid, threadsperblock = len(arr1),1
cuda_clipper[blockspergrid, threadsperblock](arr1,arr2)
print(arr1) # expected: ['bc', 'f', '']

I get the following error:

numba.core.errors.TypingError: Failed in cuda mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function _empty_string at 0x7f0456884d30>) found for signature:
 
 >>> _empty_string(int64, int64, bool)
 
There are 2 candidate implementations:
      - Of which 2 did not match due to:
      Overload in function 'register_jitable.<locals>.wrap.<locals>.ov_wrap': File: numba/core/extending.py: Line 159.
        With argument(s): '(int64, int64, bool)':
       Rejected as the implementation raised a specific error:
         NumbaRuntimeError: Failed in nopython mode pipeline (step: native lowering)
       NRT required but not enabled
       During: lowering "s = call $10load_global.3(kind, char_width, length, is_ascii, func=$10load_global.3, args=[Var(kind, unicode.py:276), Var(char_width, unicode.py:276), Var(length, unicode.py:276), Var(is_ascii, unicode.py:276)], kws=(), vararg=None, varkwarg=None, target=None)" at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (277)
  raised from /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/core/runtime/context.py:19

During: resolving callee type: Function(<function _empty_string at 0x7f0456884d30>)
During: typing of call at /mnt/local-raid10/workspace/user/anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py (1700)


File "../../anaconda3/envs/condaenv/lib/python3.9/site-packages/numba/cpython/unicode.py", line 1700:
            def getitem_slice(s, idx):
                <source elided>
                    # It's heterogeneous in kind OR stride != 1
                    ret = _empty_string(kind, span, is_ascii)
                    ^

During: typing of intrinsic-call at /mnt/local-raid10/workspace/user/trim/trim_new_implementation/string_numba.py (143)

File "string_numba.py", line 143:
def cuda_clipper(arr1,arr2):
    <source elided>
    i = cuda.grid(1)
    arr1[i] = arr1[i][arr2[i]:]
    ^

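My impression is that slicing the string is the problem, because a similar implementation works fine with numeric arrays. I tried turning arr1 into an array of arrays, but that preprocessing alone takes long enough to cancel out any performance gain from CUDA. How can I work with str directly inside Numba, rather than working around the problem?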

python cuda nvidia numba jit
1 Answer

"We have two arrays, arr1 (with string elements) and arr2 (with integers)"

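You don't have arrays. You have lists. According to the Numba documentation, neither Python strings nor Python lists are supported on the GPU.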

So what you are trying to do is not currently supported by Numba CUDA.
