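This is part of my code. I have tried it in both Python and Cython. In this case the Cython version is about 2 seconds faster in CPU time, but only when the return type is specified; without it, the Cython version is nearly 3.5 seconds slower than the Python code. Is there any way to make it faster? Any help or discussion would be appreciated. Thanks.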
%%cython
# %%cython --compile-args=-fopenmp --link-args=-fopenmp --force
cimport cython
cimport numpy as cnp
import numpy as np
from cython.parallel import parallel, prange

ctypedef cnp.int_t DTYPE

@cython.boundscheck(False)
@cython.cdivision(True)
@cython.wraparound(False)
@cython.nogil
@cython.cfunc
@cython.exceptval(-1)
@cython.returns(list)
cdef list sub_mat_extract(cnp.ndarray[DTYPE, ndim=3] mat, cython.int neibors):
    # print('sub_mat_extract: ', np.shape(mat))
    # temp = []
    cdef:
        Py_ssize_t M = 0, N = 0, x = 0
        Py_ssize_t i
        Py_ssize_t j
        Py_ssize_t row = mat.shape[0]
        Py_ssize_t col = mat.shape[1]
        list temp = []
        list temp1 = []
        list dup1 = []
        list dup2 = []

    # Collect one neibors x neibors window per starting row M.
    # `//` keeps the bounds integral regardless of the language level.
    for i in range((neibors - 1) // 2, row - ((neibors - 1) // 2)):
        N = 0
        temp1 = []
        for j in range(col):
            temp1.extend(mat[j + M][0 + N : neibors + N])
            # print(i, M, mat[i+M][0+N : 3+N])
            # print(temp1)
            if j + M == neibors + M - 1:
                M = M + 1
                break
        temp.append(temp1)
        N += 1
        if M == col:
            break

    # Shift each window to the right by x columns.
    dup1 = []
    for i in range(len(temp)):
        x = 0
        while x <= col - neibors:
            dup2 = []
            for j in range(len(temp[i])):
                # print([temp[i][j][0], temp[i][j][1] + x])
                dup2.append([temp[i][j][0], temp[i][j][1] + x])
            dup1.append(dup2)
            x = x + 1
    return dup1

def action(mat, neibor):
    return sub_mat_extract(np.array(mat), neibor)
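To make the intent of the two loops easier to follow, here is a minimal plain-Python sketch of what the function appears to compute. It is a reading of the code, not the original implementation, and it rests on assumptions not stated in the post: `neibors` is odd, the matrix is square, and each cell stores its own coordinates, i.e. `mat[r][c] == [r, c]`. The name `sub_mat_extract_reference` is hypothetical.

```python
import numpy as np


def sub_mat_extract_reference(mat, neibors):
    """Plain-Python reading of sub_mat_extract (hypothetical helper name).

    Assumes odd `neibors`, a square matrix, and mat[r][c] == [r, c].
    """
    mat = np.asarray(mat)
    row, col = mat.shape[0], mat.shape[1]
    out = []
    # First loop of the original: one window per top row, anchored at column 0.
    for top in range(row - neibors + 1):
        window = [mat[top + a][b] for a in range(neibors) for b in range(neibors)]
        # Second loop of the original: slide the window right by adding x
        # to the stored column coordinate of every cell.
        for x in range(col - neibors + 1):
            out.append([[p[0], p[1] + x] for p in window])
    return out


# Usage: a 4 x 4 grid of [row, col] pairs and a 3 x 3 neighbourhood.
mat = np.stack(np.indices((4, 4)), axis=-1)
windows = sub_mat_extract_reference(mat, 3)
print(len(windows), len(windows[0]))  # 4 9
```

Under those assumptions the result holds one entry per window position, each a flat list of `neibors * neibors` coordinate pairs.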
Timing for the Python version:
CPU times: total: 5.23 s
Wall time: 5.77 s
And for the Cython version:
CPU times: total: 3.14 s
Wall time: 4.78 s
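The figures above have the shape of Jupyter's `%%time` output. For anyone who wants to reproduce this kind of comparison outside a notebook, the following is a minimal sketch that reports CPU and wall time separately using only the standard library. `time_call` and `workload` are hypothetical names, and `workload` is merely a stand-in for a call such as `action(mat, neibor)`.

```python
import time


def time_call(func, *args, repeats=3):
    """Return (best_cpu_seconds, best_wall_seconds) over `repeats` calls."""
    best_cpu = best_wall = float("inf")
    for _ in range(repeats):
        cpu0, wall0 = time.process_time(), time.perf_counter()
        func(*args)
        cpu1, wall1 = time.process_time(), time.perf_counter()
        best_cpu = min(best_cpu, cpu1 - cpu0)
        best_wall = min(best_wall, wall1 - wall0)
    return best_cpu, best_wall


def workload(n):
    # Hypothetical stand-in for action(mat, neibor): builds nested lists.
    return [[i, j] for i in range(n) for j in range(n)]


cpu, wall = time_call(workload, 200)
print(f"CPU: {cpu:.4f} s, wall: {wall:.4f} s")
```

Taking the best of several repeats reduces noise from other processes, which matters when the gap between two versions is only a second or two.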
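I am trying to convert all of my code from plain Python to Cython, and I want to see whether Cython is faster than Python in every case. My ultimate goal is to understand how fast the code can be made to run, both by making use of the hardware (numba + multiprocessing) and by using Python-like compilers. I only run the code in a Jupyter notebook.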
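When comparing compilers, a vectorized NumPy baseline can be a useful extra data point. The sketch below is one possible baseline, not the code from this post: it uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) and returns an array rather than nested lists. It matches the function above only under the assumption that each cell stores its own coordinates, `mat[r, c] == [r, c]`. The name `windows_vectorized` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windows_vectorized(mat, k):
    """All k x k windows of a (rows, cols, 2) array, flattened row-major.

    Hypothetical helper, not the original function.
    """
    mat = np.asarray(mat)
    # Shape: (rows - k + 1, cols - k + 1, 2, k, k); window axes come last.
    views = sliding_window_view(mat, (k, k), axis=(0, 1))
    # Move the length-2 pair axis to the end, then flatten each window.
    return views.transpose(0, 1, 3, 4, 2).reshape(-1, k * k, 2)


mat = np.stack(np.indices((4, 4)), axis=-1)  # mat[r, c] == [r, c]
out = windows_vectorized(mat, 3)
print(out.shape)  # (4, 9, 2)
```

Because the windows are built from strided views and a single reshape, no Python-level loop runs per cell, which is usually where pure-Python and list-heavy Cython code spend their time.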