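How can I efficiently compute Euclidean distance matrices in Python without for loops?

Question  Votes: 1  Answers: 1

I have an array of shape (51266, 20, 25, 3), i.e. (N, F, J, C), where N is the example index, F is the frame index, J is the joint, and C holds the joint's xyz coordinates. For every frame of every example I want to compute the Euclidean distance matrix between the joints, giving an array of shape (51266, 20, 25, 25). My code is: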

from sklearn.metrics.pairwise import euclidean_distances as euc
from tqdm import tqdm
import numpy as np
Examples = np.load('allExamples.npy')
theEuclideanMethod = np.zeros((0,20,25,25))
for example in tqdm(range(Examples.shape[0])):
  euclideanBox = np.zeros((0,25,25))
  for frame in range(20):
    euclideanBox = np.concatenate((euclideanBox,euc(Examples[example,frame,:,:])[np.newaxis,...]),axis=0)

  euclideanBox = euclideanBox[np.newaxis,...]
  theEuclideanMethod = np.concatenate((theEuclideanMethod,euclideanBox))

np.save("Euclidean examples.npy",theEuclideanMethod)
print(theEuclideanMethod.shape,"Euclidean shape")  

The problem is that I am using for loops, and it is extremely slow. Is there another way to make my code run faster?

python numpy
1 answer
0
votes

This should run fairly fast. Using float32 keeps memory usage low, but it is optional. Increase batch_size for more speed, or decrease it to use less memory.

import numpy as np

# Adjust batch_size depending on your memory
batch_size = 500

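# Make some fake data (x and y are separate arrays here; for the question's
# joint-to-joint distances within one array, use the same array for both)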
x = np.random.randn(51266,20,25,3).astype(np.float32)
y = np.random.randn(51266,20,25,3).astype(np.float32)

# distance_matrix
d = np.empty(x.shape[:-1] + (x.shape[-2],), dtype=np.float32)
# Number of batches
N = (x.shape[0]-1) // batch_size + 1
for i in range(N):
    d[i*batch_size:(i+1)*batch_size] = np.sqrt(np.sum((
        x[i*batch_size:(i+1)*batch_size,:,:,None] - \
        y[i*batch_size:(i+1)*batch_size,:,None,:])**2, axis=-1))
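For the question's case, where the distances are taken between the joints of a single array, the batched approach above can be wrapped in a small function. This is a minimal sketch, not the answerer's code: the function name `pairwise_joint_distances`, the small test shape, and the float32 output are assumptions made for illustration.

```python
import numpy as np

def pairwise_joint_distances(x, batch_size=500):
    """Euclidean distances between all joint pairs, per example and frame.

    x has shape (N, F, J, C); the result has shape (N, F, J, J).
    The output is float32 (an assumption, matching the answer above).
    """
    n = x.shape[0]
    d = np.empty(x.shape[:-1] + (x.shape[-2],), dtype=np.float32)
    for start in range(0, n, batch_size):
        chunk = x[start:start + batch_size]
        # (B, F, J, 1, C) - (B, F, 1, J, C) broadcasts to (B, F, J, J, C)
        diff = chunk[:, :, :, None, :] - chunk[:, :, None, :, :]
        d[start:start + batch_size] = np.sqrt(np.sum(diff ** 2, axis=-1))
    return d

# Small fake data so the check runs quickly; the batch size deliberately
# does not divide N, to exercise the last partial batch.
rng = np.random.default_rng(0)
small = rng.standard_normal((7, 4, 5, 3)).astype(np.float32)
dist = pairwise_joint_distances(small, batch_size=3)
print(dist.shape)
```

Slicing past the end of an array is safe in NumPy, so the last batch needs no special handling. Each distance matrix should be symmetric with a zero diagonal, which makes a cheap sanity check.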

0
votes

You can use array broadcasting, like this:

import numpy as np

examples = np.random.uniform(size=(5, 6, 7, 3))
N, F, J, C = examples.shape

# deltas.shape == (N, F, J, J, C) - Cartesian deltas
deltas  = examples.reshape(N, F, J, 1, C) - examples.reshape(N, F, 1, J, C)

# distances.shape == (N, F, J, J)
distances = np.sqrt((deltas**2).sum(axis=-1), dtype=np.float32)

del deltas # release memory (only needed for interactive use)

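This is somewhat memory-hungry: for the N, F, J, C you mention, the intermediate result (deltas) would need roughly 15.4 GB, assuming double precision. It is more efficient to preallocate the output array in single precision and loop over the N axis (about 6x less memory, and better use of the cache):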

distances = np.empty((N, F, J, J), dtype=np.float32)

for i, ex in enumerate(examples):
    # deltas.shape = (F, J, J, C) - Cartesian deltas
    deltas = ex.reshape(F, J, 1, C) - ex.reshape(F, 1, J, C)
    distances[i] = np.sqrt((deltas**2).sum(axis=-1))
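The memory estimate in this answer can be checked with a little arithmetic. The sketch below assumes the question's shape and reads the 6x figure as the ratio between the full float64 deltas array and a float32 output array; that reading is an interpretation, not something the answer states explicitly.

```python
import numpy as np

# Shape from the question: examples, frames, joints, coordinates
N, F, J, C = 51266, 20, 25, 3

f64 = np.dtype(np.float64).itemsize  # 8 bytes
f32 = np.dtype(np.float32).itemsize  # 4 bytes

# Full broadcast intermediate: one C-vector per joint pair
deltas_bytes = N * F * J * J * C * f64
# Final output stored in single precision
distances_bytes = N * F * J * J * f32
# Intermediate for a single example when looping over N
per_example_bytes = F * J * J * C * f64

print(f"deltas (all examples): {deltas_bytes / 1e9:.2f} GB")
print(f"distances (float32):   {distances_bytes / 1e9:.2f} GB")
print(f"deltas (one example):  {per_example_bytes / 1e3:.1f} kB")
print(f"ratio: {deltas_bytes / distances_bytes:.1f}x")
```

With the loop, the per-example intermediate is only a few hundred kilobytes, so peak memory is dominated by the output array.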