I have two arrays, A and B, both of shape (42, 28, 4), where:
42 : y_dim
28 : x_dim
4 : RGBA
## I'm on MacBook Air M1 2020 16Gb btw
I want to combine them using a process similar to this:
def add(A, B):
    X = A.shape[1]
    Y = A.shape[0]
    alpha = A[..., 3] / 255
    B[..., :3] = blend(B[..., :3], A[..., :3], alpha.reshape(Y, X, 1))
    return B

def blend(c1, c2, alpha):
    return np.asarray((c1 + np.multiply(c2, alpha))/(np.ones(alpha.shape) + alpha), dtype='uint8')
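But this is currently a bit too slow for me (about 20 ms for 250 images overlaid on top of a base array [1]). If you know of any way to improve it (preferably with 8-bit alpha support), I'd be happy to hear it.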
[1]:
start = time.time()
for obj in l: # len(l) == 250
    _slice = np.index_exp[obj.y * 42:(obj.y+1) * 42, obj.x * 28 : (obj.x+1) * 28, :]
    self.pixels[_slice] = add(obj.array, self.pixels[_slice])
stop = time.time()
>>> stop - start # ~20ms
I have made half-hearted attempts at the following:
# cv2.addWeighted() in add()
## doesn't work because it has one alpha for the whole image,
## but I want to have individual alpha control for each pixel
B = cv.addWeighted(A, 0.5, B, 0.5, 0)
# np.vectorize blend() and use in add()
## way too slow because as the docs mention it's basically just a for-loop
B[..., :3] = np.vectorize(blend)(A[..., :3], B[..., :3], A[..., 3] / 255)
# changed blend() to the following
def blend(a, b, alpha):
    if alpha == 0:
        return b
    elif alpha == 1:
        return a
    return (b + a * alpha) / (1 + alpha)
# moved the blend()-stuff to add()
## doesn't combine properly; too dark with alpha
np.multiply(A, alpha.reshape(Y, X, 1)) + np.multiply(B, 1 - alpha.reshape(Y, X, 1))
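I also tried some bitwise stuff, but my monkey brain couldn't make sense of it. I'm on an M1 Mac, so if you have any experience with Metalcompute and Python, please share any thoughts on that as well!
Any input is welcome, thanks in advance!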
Here is a numba version. On my machine (AMD 5700X) it is about 2x faster than the original (I don't have an M1, so your results may differ):
@njit
def add_numba(A, B):
    alpha = A[..., 3] / 255
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            B[i, j, :3] = (B[i, j, :3] + A[i, j, :3] * alpha[i, j]) / (1 + alpha[i, j])
    return B
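On the 8-bit alpha request: the formula above, (b + a * alpha) / (1 + alpha) with alpha = A[..., 3] / 255, can also be evaluated without floats by multiplying numerator and denominator by 255. The following is a minimal NumPy sketch of that idea, not something I have benchmarked. The name `add_int` is made up for illustration, and because integer floor division replaces float truncation, a result may differ from the float version by 1 in rare rounding cases.

```python
import numpy as np


def add_int(A, B):
    """Blend A onto B in place using integer arithmetic only.

    Computes (255 * b + a * alpha8) // (255 + alpha8), which is the
    question's formula with numerator and denominator scaled by 255.
    """
    # Keep a trailing axis so alpha broadcasts over the RGB channels.
    alpha8 = A[..., 3:4].astype(np.uint32)  # shape (Y, X, 1), values 0..255
    # Largest possible value is 255*255 + 255*255 = 130050, so uint32 is ample.
    num = 255 * B[..., :3].astype(np.uint32) + A[..., :3] * alpha8
    # The quotient never exceeds 255, so it fits back into the uint8 array.
    B[..., :3] = num // (255 + alpha8)
    return B
```

As with `add` and `add_numba`, only the RGB channels of B are changed; its alpha channel is left untouched.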
Benchmark of the original add against add_numba:
from statistics import median
from timeit import repeat
import numpy as np
from numba import njit
@njit
def add_numba(A, B):
    alpha = A[..., 3] / 255
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            B[i, j, :3] = (B[i, j, :3] + A[i, j, :3] * alpha[i, j]) / (1 + alpha[i, j])
    return B

def setup_A_B():
    A = np.random.randint(0, 255, size=(42, 28, 4), dtype="uint8")
    B = np.random.randint(0, 255, size=(42, 28, 4), dtype="uint8")
    return A, B

def add(A, B):
    X = A.shape[1]
    Y = A.shape[0]
    alpha = A[..., 3] / 255
    B[..., :3] = blend(B[..., :3], A[..., :3], alpha.reshape(Y, X, 1))
    return B

def blend(c1, c2, alpha):
    return np.asarray(
        (c1 + np.multiply(c2, alpha)) / (np.ones(alpha.shape) + alpha), dtype="uint8"
    )
# assert the result is equal
np.random.seed(42)
A1, B1 = setup_A_B()
A2, B2 = A1.copy(), B1.copy()
assert np.allclose(add(A1, B1), add_numba(A2, B2))
repeats_normal = repeat(
    "add(A, B)", setup="A, B = setup_A_B()", globals=globals(), repeat=10, number=2500
)
repeats_numba = repeat(
    "add_numba(A, B)",
    setup="A, B = setup_A_B()",
    globals=globals(),
    repeat=10,
    number=2500,
)
print(f"2500 calls (original) = {median(repeats_normal):.4f}")
print(f"2500 calls (numba) = {median(repeats_numba):.4f}")
Prints:
2500 calls (original) = 0.1501
2500 calls (numba) = 0.0742
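Since each call only handles a 42x28 tile, part of the 20 ms is probably Python overhead from making 250 separate calls. One thing that might be worth trying is blending all tiles in a single NumPy pass. The sketch below is untested on an M1 and not benchmarked; `add_batch` and its parameters are hypothetical names. It assumes that the base image is C-contiguous, that its height and width are multiples of 42 and 28, and that no two objects share the same (y, x) cell (with duplicates, only one of them would end up in the result).

```python
import numpy as np

TILE_H, TILE_W = 42, 28  # tile size from the question


def add_batch(pixels, tiles, ys, xs):
    """Blend N tiles onto `pixels` in one vectorized pass (in place).

    pixels: (rows * 42, cols * 28, 4) uint8 base image, C-contiguous
    tiles:  (N, 42, 28, 4) uint8 stack of the objects' arrays
    ys, xs: (N,) integer grid coordinates, each (y, x) pair unique
    """
    if not pixels.flags["C_CONTIGUOUS"]:
        raise ValueError("pixels must be C-contiguous so the reshape is a view")
    rows, cols = pixels.shape[0] // TILE_H, pixels.shape[1] // TILE_W
    # View the base image as a grid of tiles: (rows, cols, 42, 28, 4).
    grid = pixels.reshape(rows, TILE_H, cols, TILE_W, 4).swapaxes(1, 2)
    # Advanced indexing gathers the N target tiles as a copy: (N, 42, 28, 3).
    base = grid[ys, xs, ..., :3]
    alpha = tiles[..., 3:4] / 255  # (N, 42, 28, 1), broadcasts over RGB
    blended = (base + tiles[..., :3] * alpha) / (1 + alpha)
    # Scatter the results back through the view into `pixels`.
    grid[ys, xs, ..., :3] = blended.astype(np.uint8)
    return pixels
```

Usage would look roughly like `add_batch(self.pixels, np.stack([o.array for o in l]), np.array([o.y for o in l]), np.array([o.x for o in l]))`. Whether this beats the numba loop depends on how expensive the stacking step is, so it is worth timing on your own data.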