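I have a computation in which I need to iterate over the items of a 3D numpy array and add to each of them the value found at a fixed index along the array's second dimension (skipping the entry at that index itself). It is similar to this minimal reproducible example: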
import numpy as np

data = np.array([
    [[1, 1, 1], [10, 10, 10], [1, 1, 1]],
    [[2, 2, 2], [20, 20, 20], [2, 2, 2]],
    [[3, 3, 3], [30, 30, 30], [3, 3, 3]]])

def process_data(const_idx, data, i, j, k):
    if const_idx != j:
        # PROBLEM: how can I access this value if this function is vectorized?
        value_to_add = data[i][const_idx][k]
        data[i][j][k] += value_to_add

const_idx = 1
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        for k in range(data.shape[2]):
            process_data(const_idx, data, i, j, k)
print(data)
The expected output in this example is:
[[[11 11 11]
  [10 10 10]
  [11 11 11]]

 [[22 22 22]
  [20 20 20]
  [22 22 22]]

 [[33 33 33]
  [30 30 30]
  [33 33 33]]]
The code above works, but it is very slow for large arrays. I would like to vectorize this function.
My first attempt looked like this:
def process_data(val, data, const_idx):
    # PROBLEM: How can I access this value given that I do not have access to the i / j / k coordinates val came from?
    value_to_add = ...
    # PROBLEM: I cannot make this check either since I don't know the j index being processed here
    if const_idx != j:
        return val + value_to_add
    else:
        return val

vfunc = np.vectorize(process_data)
result = vfunc(data, data, const_idx)
print(result)
How can I achieve this, or is vectorization not the answer?
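One possible way to keep the np.vectorize shape of the attempt above, sketched under assumptions: the missing j index can be passed in as an extra array argument built with np.indices, and the value to add can be passed as a slice that broadcasts against data. The names process_value, j_grid and value_grid are made up for this illustration. Note that np.vectorize still calls the Python function once per element, so this is unlikely to be faster than the explicit loops.

```python
import numpy as np

data = np.array([
    [[1, 1, 1], [10, 10, 10], [1, 1, 1]],
    [[2, 2, 2], [20, 20, 20], [2, 2, 2]],
    [[3, 3, 3], [30, 30, 30], [3, 3, 3]]])
const_idx = 1

def process_value(val, j, value_to_add, const_idx):
    # Hypothetical scalar function: every argument arrives as a single element.
    if const_idx != j:
        return val + value_to_add
    return val

# j_grid[i, j, k] == j, so each element knows its index along the second axis.
_, j_grid, _ = np.indices(data.shape)

# Shape (3, 1, 3): keeping the middle axis lets it broadcast against data.
value_grid = data[:, [const_idx], :]

vfunc = np.vectorize(process_value)
result = vfunc(data, j_grid, value_grid, const_idx)
print(result)
```

Unlike the loop version, this returns a new array and leaves data unchanged.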
const_idx refers to the row index whose values serve as the addend.

def add_by_idx(arr, idx):
    r = np.arange(arr.shape[1])  # row indices
    arr[:, r[r != idx], :] += arr[:, [idx], :]

add_by_idx(data, 1)
print(data)
[[[11 11 11]
  [10 10 10]
  [11 11 11]]

 [[22 22 22]
  [20 20 20]
  [22 22 22]]

 [[33 33 33]
  [30 30 30]
  [33 33 33]]]
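If the original array should stay untouched, a non-mutating variant might look like the sketch below. It relies on the same broadcasting idea: the slice arr[:, [idx], :] has shape (I, 1, K) and is stretched along the second axis, while a boolean mask selects which rows receive the addition. The name add_by_idx_copy is hypothetical and not part of the answer above.

```python
import numpy as np

def add_by_idx_copy(arr, idx):
    # Hypothetical helper: returns a new array instead of modifying arr.
    # True for every row except idx, reshaped to (1, J, 1) for broadcasting.
    mask = (np.arange(arr.shape[1]) != idx)[None, :, None]
    # Shape (I, 1, K): the row to add, broadcast along the second axis.
    addend = arr[:, [idx], :]
    # Add where the mask is True, keep the original value elsewhere.
    return np.where(mask, arr + addend, arr)

data = np.array([
    [[1, 1, 1], [10, 10, 10], [1, 1, 1]],
    [[2, 2, 2], [20, 20, 20], [2, 2, 2]],
    [[3, 3, 3], [30, 30, 30], [3, 3, 3]]])

result = add_by_idx_copy(data, 1)
print(result)
```

This computes arr + addend for the whole array before masking, so it trades some extra memory for leaving the input intact.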