Can I always rely on ctypes.data_as to keep a reference to temporary objects?

Question (votes: 2, answers: 2)

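When passing arrays from python to a back-end c++ library, is it okay to rely on the following? It used to work in python <= 3.6, but seems to cause sporadic crashes in python >= 3.7:

(This is a simplified version of the "real" code, in which a user-facing python interface passes data back and forth to an underlying c++ lib.)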

    import ctypes
    import numpy as np

    # a 2d array, possibly not order="F"
    xmat = np.ones((16, 32), dtype=np.float64)

    # get a pointer to a version of xmat that is guaranteed to have order="F"
    # if xmat already has order="F": no temporary
    # if not, a temporary copy is made, reordered and a ptr to that returned
    xptr = np.asfortranarray(xmat).ctypes.data_as(ctypes.POINTER(ctypes.c_double))

    # pass xptr to c++ back-end to do things (expects order="F" data)
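The "temporary or no temporary" distinction in the comments above can be checked directly. The following is a minimal sketch (the variable names are made up for illustration) that uses np.shares_memory to see whether np.asfortranarray allocated a new buffer:

```python
import numpy as np

c_mat = np.ones((16, 32), dtype=np.float64, order="C")
f_mat = np.ones((16, 32), dtype=np.float64, order="F")

# Layout differs from order="F", so a reordered copy has to be allocated.
c_as_f = np.asfortranarray(c_mat)
# Layout already matches order="F", so no copy is needed.
f_as_f = np.asfortranarray(f_mat)

print(np.shares_memory(c_mat, c_as_f))  # separate buffers in the C-order case
print(np.shares_memory(f_mat, f_as_f))  # same buffer in the F-order case
```

Only the first case involves a temporary whose lifetime has to be managed; in the second case the pointer refers to the buffer of the original array, which the caller still holds.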

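As I (currently!) understand it, ctypes.data_as should:

"Return the data pointer cast to a particular c-types object..."

"The returned pointer will keep a reference to the array."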

Further examples in the docs suggest that data_as is the right thing to use when a temporary array is created, e.g. (a + b).ctypes.data_as(ctypes.c_void_p).

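In python >= 3.7, it seems that data_as does not keep a reference to the temporary, and in the snippet above xptr ends up pointing to freed memory...

Am I doing something wrong? Is this a bug in python >= 3.7? Is there a better way to do this?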


Here is a complete example (with some additional boilerplate that marshals the array into a struct for the back-end library):

    import numpy as np
    import ctypes as ct

    lib_REALS_t = ct.c_double
    lib_INDEX_t = ct.c_int32

    lib_REALS_p = ct.POINTER(lib_REALS_t)

    class lib_REALS_array_t(ct.Structure):
        _fields_ = [("size", lib_INDEX_t),
                    ("data", lib_REALS_p)]

    class lib_t(ct.Structure):
        _fields_ = [
            ("value", lib_REALS_array_t)]

    def bug():
        libt = lib_t()

        # a 2d array, user-specified, possibly not order="F"
        xmat = np.ones((16, 32), dtype=np.float64, order="C")

        # get a pointer to a version of xmat that is guaranteed to have order="F"
        # if xmat already has order="F": no temporary
        # if not, a temporary copy is made, reordered and a ptr to that returned
        libt.value.size = xmat.size
        libt.value.data = np.asfortranarray(xmat).ctypes.data_as(ct.POINTER(lib_REALS_t))

        # pass xptr to c++ back-end to do things (expects order="F" data)
        # just "simulate" this by trying to access data using the pointer
        print(libt.value.data[1])

        return

    if __name__ == "__main__":
        bug()

For me, python <= 3.6 prints 1.0 (as expected), while python >= 3.7 prints 6.92213454250094e-310 (i.e. the temporary must have been freed, so the pointer refers to uninitialised memory).
python numpy ctypes
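2 Answers

0 votes

After spending some time reading about ctypes and looking for any breaking changes, I was able to work around the problem simply by adding a proxy variable. I could easily reproduce the problem just by pasting your example code.

I'm not sure why this happens, but my guess is that assigning a pointer directly to another pointer is wrong in ctypes. I'll look into other possibilities as well. For now, you can work around it by adding a proxy variable, as shown below: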

    def bug():
        libt = lib_t()

        # a 2d array, user-specified, possibly not order="F"
        xmat = np.ones((16, 32), dtype=np.float64, order="C")

        # get a pointer to a version of xmat that is guaranteed to have order="F"
        # if xmat already has order="F": no temporary
        # if not, a temporary copy is made, reordered and a ptr to that returned
        libt.value.size = xmat.size
        temp_p = np.asfortranarray(xmat).ctypes.data_as(ct.POINTER(lib_REALS_t))
        libt.value.data = temp_p

        # pass xptr to c++ back-end to do things (expects order="F" data)
        # just "simulate" this by trying to access data using the pointer
        print(libt.value.data[1])

        return
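A variation on the same idea is to bind the reordered array itself to a name, rather than only the pointer, so that the lifetime of the buffer is explicit. The sketch below assumes the back-end call finishes before the function returns; fake_backend and call_backend are hypothetical names, and fake_backend merely stands in for the real c++ routine by reading through the pointer:

```python
import ctypes as ct
import numpy as np

DoublePtr = ct.POINTER(ct.c_double)


def fake_backend(ptr, count):
    """Hypothetical stand-in for the c++ call: sums `count` doubles."""
    return sum(ptr[i] for i in range(count))


def call_backend(xmat):
    # Bind the (possibly temporary) F-ordered array to a local name so its
    # buffer stays alive until this function returns.
    fmat = np.asfortranarray(xmat, dtype=np.float64)
    ptr = fmat.ctypes.data_as(DoublePtr)
    return fake_backend(ptr, fmat.size)


total = call_backend(np.arange(6, dtype=np.float64).reshape(2, 3))
print(total)  # 0 + 1 + 2 + 3 + 4 + 5 = 15.0
```

Because fmat is a local variable, the array cannot be released while fake_backend is still reading from it, regardless of what the pointer object itself keeps alive.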


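0 votes

Listing [Python 3.Docs]: ctypes - A foreign function library for Python.

After some investigation and digging through the code, I reached a conclusion (I had an intuition about what was going on from the start).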

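It seems that [SciPy.Docs]: numpy.ndarray.ctypes, where it says:

_ctypes.data_as(self, obj)
...
"The returned pointer will keep a reference to the array."

is misleading. Keeping a reference means that it will keep the array buffer address (in the sense that it won't copy the memory), and not a Python reference (Py_XINCREF). Looking at [Github]: numpy/numpy - numpy/numpy/core/_internal.py:

    def data_as(self, obj):
        # Comments
        return self._ctypes.cast(self._data, obj)

This is a call to ctypes.cast, which only holds the source array address.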
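The "only an address" point can be illustrated on the struct side, independently of NumPy's internals. In this sketch (Holder is a hypothetical one-field struct, not part of the question's code) the C layout of the struct has room for a raw address and nothing else, and that address is the one of the array's buffer:

```python
import ctypes as ct
import numpy as np

DoublePtr = ct.POINTER(ct.c_double)


class Holder(ct.Structure):  # hypothetical struct, for illustration only
    _fields_ = [("data", DoublePtr)]


arr = np.arange(4, dtype=np.float64)  # kept alive by the name `arr`
holder = Holder()
holder.data = arr.ctypes.data_as(DoublePtr)

# The struct is exactly one pointer wide.
print(ct.sizeof(Holder) == ct.sizeof(ct.c_void_p))

# The stored value is the address of the array's buffer.
stored = ct.cast(holder.data, ct.c_void_p).value
print(stored == arr.ctypes.data)
```

Here the read is safe only because arr is still bound to a name; the struct field by itself carries no information about who owns the memory.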

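What happens is that np.asfortranarray(xmat) creates a temporary array on the fly, and ctypes.data_as then returns the address of its buffer. After that line, the temporary goes out of scope (and so does its internal buffer), but its address is still referenced, which yields Undefined Behavior (UB).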

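The v1.15.0 [SciPy.Docs]: numpy.ndarray.ctypes mentions (emphasis is mine):

"Be careful using the ctypes attribute - especially on temporary arrays or arrays constructed on the fly. For example, calling (a+b).ctypes.data_as(ctypes.c_void_p) returns a pointer to memory that is invalid because the array created as (a+b) is deallocated before the next Python statement. You can avoid this problem using either c=a+b or ct=(a+b).ctypes. In the latter case, ct will hold a reference to the array until ct is deleted or re-assigned."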

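But they took it out afterwards. To get past the error, "save" the temporary array, or keep a (Python) reference to it. The same problem was encountered in [SO]: Access violation when trying to read out object created in Python passed to std::vector on C++ side and then returned to Python (@CristiFati's answer). I changed your code a bit (including those horrible names :) ).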
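When the struct has to outlive the function that fills it in, one way to "keep a Python reference" is to store the array next to the struct in a single owning object. This is only a sketch under that assumption; OwnedReals is a hypothetical helper, and RealsArray mirrors the size/data struct from the question:

```python
import ctypes as ct
import numpy as np

DoublePtr = ct.POINTER(ct.c_double)


class RealsArray(ct.Structure):  # mirrors the question's size/data struct
    _fields_ = [("size", ct.c_int32), ("data", DoublePtr)]


class OwnedReals:
    """Hypothetical helper: bundles the struct with the array it points into."""

    def __init__(self, xmat):
        # Stored as an attribute, so the buffer lives as long as this object.
        self._array = np.asfortranarray(xmat, dtype=np.float64)
        self.struct = RealsArray(self._array.size,
                                 self._array.ctypes.data_as(DoublePtr))


owned = OwnedReals(np.arange(6, dtype=np.float64).reshape(2, 3))
# Column-major layout: element 1 of the flat buffer is xmat[1, 0], i.e. 3.0.
print(owned.struct.size, owned.struct.data[1])
```

As long as owned is alive, owned.struct can be handed to the back-end; using distinct values instead of all ones also makes it visible that the data really is in order="F".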

code00.py

    #!/usr/bin/env python3

    import sys
    import ctypes as ct
    import numpy as np
    from collections import defaultdict


    DblPtr = ct.POINTER(ct.c_double)


    class Struct0(ct.Structure):
        _fields_ = [
            ("size", ct.c_uint32),
            ("data", DblPtr),
        ]


    class Wrapper(ct.Structure):
        _fields_ = [
            ("value", Struct0),
        ]


    def test_np(np_array, save_intermediary_array):
        wrapper = Wrapper()
        wrapper.value.size = np_array.size
        if save_intermediary_array:
            fortran_array = np.asfortranarray(np_array)
            wrapper.value.data = fortran_array.ctypes.data_as(DblPtr)
        else:
            wrapper.value.data = np.asfortranarray(np_array).ctypes.data_as(DblPtr)
        #print(wrapper.value.data[0])
        return wrapper.value.data[1]


    def main(*argv):
        dim1, dim0 = 16, 32
        mat = np.ones((dim1, dim0), dtype=np.float64, order="C")
        print("NumPy CTypes data: {0:}\n{1:}".format(mat.ctypes, mat.ctypes._ctypes))
        dd = defaultdict(int)
        flag = 0  # Change to 1 to avoid problem
        print("Saving intermediary array: {0:d}".format(flag))
        for i in range(100):
            dd[test_np(mat, flag)] += 1
        print("\nResult: {0:}".format(dd))


    if __name__ == "__main__":
        print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
        print("NumPy version: {0:}".format(np.version.version))
        main(*sys.argv[1:])
        print("\nDone.")

Output:

    e:\Work\Dev\StackOverflow\q059959608>sopr.bat
    *** Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ***

    [prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
    Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

    NumPy version: 1.18.0
    NumPy CTypes data: <numpy.core._internal._ctypes object at 0x000001C9744B0348>
    <module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
    Saving intermediary array: 0

    Result: defaultdict(<class 'int'>, {9.707134377684e-312: 100})

    Done.

    [prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
    Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

    NumPy version: 1.18.0
    NumPy CTypes data: <numpy.core._internal._ctypes object at 0x000001842ECA4FC8>
    <module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
    Saving intermediary array: 0

    Result: defaultdict(<class 'int'>, {1.0: 100})

    Done.

    [prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
    Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

    NumPy version: 1.18.0
    NumPy CTypes data: <numpy.core._internal._ctypes object at 0x000001AD586E91C8>
    <module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
    Saving intermediary array: 0

    Result: defaultdict(<class 'int'>, {9.110668798574e-312: 100})

    Done.

    [prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code01.py
    Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

    NumPy version: 1.18.0
    NumPy CTypes data: <numpy.core._internal._ctypes object at 0x0000012F903A9188>
    <module 'ctypes' from 'c:\\Install\\pc064\\Python\\Python\\03.07.06\\Lib\\ctypes\\__init__.py'>
    Saving intermediary array: 0

    Result: defaultdict(<class 'int'>, {6.44158096444e-312: 100})

    Done.

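Notes:

  • The results are quite random, which is a UB indicator. Interestingly though, within the same run it is always the same value (the defaultdict has only one item)
  • Changing flag to 1 (or anything that evaluates to True) makes the problem go away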