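As the title says, I need to compute the Euclidean distance between every possible pair of column vectors of a given matrix, using only numpy and no loops.
This produces the output I am looking for (but with loops):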
import numpy as np

def all_column_euclidean(x):
    output = np.zeros((len(x[0]), len(x[0])))
    for i in range(len(x[0])):
        for j in range(len(x[0])):
            output[i][j] = np.sqrt(np.sum((x[:, i] - x[:, j])**2))
    return output
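You can do this with numpy broadcasting, which is faster than Python loops. The example below uses ten 2-D points whose coordinates are stored in x and y: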
import numpy as np
x = np.random.rand(10)
y = np.random.rand(10)
# x[j] - x[i] for every pair (i, j); broadcasting gives a (10, 10) array
xi_minus_xj = x - x.reshape(-1,1)
# y[j] - y[i] for every pair (i, j)
yi_minus_yj = y - y.reshape(-1,1)
# sqrt((xi-xj)**2 + (yi-yj)**2) for every pair (i, j)
distances = np.sqrt(xi_minus_xj**2 + yi_minus_yj**2)
# distance between the i-th and j-th points
print(distances[2,3])
print(distances[2,2])
print(distances[1,8])
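The snippet above works on separate x and y coordinate arrays. For the matrix-of-columns case in the question, the same broadcasting idea might look like the sketch below. The function name all_column_euclidean_broadcast is made up for illustration, and the input is assumed to be a 2-D array whose columns are the vectors to compare.

```python
import numpy as np

def all_column_euclidean_broadcast(x):
    """Pairwise Euclidean distances between the columns of a 2-D array.

    Hypothetical helper name; a sketch of the broadcasting approach.
    """
    x = np.asarray(x, dtype=float)
    # diff[k, i, j] = x[k, i] - x[k, j], shape (rows, cols, cols)
    diff = x[:, :, np.newaxis] - x[:, np.newaxis, :]
    # Sum the squared differences over the row axis, then take the root
    return np.sqrt((diff ** 2).sum(axis=0))

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
result = all_column_euclidean_broadcast(data)
print(result)
```

Note that the intermediate array has shape (rows, cols, cols), so memory use grows quickly with the number of columns.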
S.Vengat is right: one way or another a loop has to run somewhere, but there is a library that lets you do this in one line:
import numpy as np
from scipy.spatial.distance import cdist
data = np.array([[1,2,3],[4,5,6],[7,8,9]])
# cdist compares rows, so pass the transpose to compare columns
cdist(data.T, data.T)
This gives:
array([[0.        , 1.73205081, 3.46410162],
       [1.73205081, 0.        , 1.73205081],
       [3.46410162, 1.73205081, 0.        ]])
Your code, for comparison:
import numpy as np

def all_column_euclidean(x):
    output = np.zeros((len(x[0]), len(x[0])))
    for i in range(len(x[0])):
        for j in range(len(x[0])):
            output[i][j] = np.sqrt(np.sum((x[:, i] - x[:, j])**2))
    return output
data = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(all_column_euclidean(data))
Output:
[[0.         1.73205081 3.46410162]
 [1.73205081 0.         1.73205081]
 [3.46410162 1.73205081 0.        ]]
scipy.spatial.distance has functions for exactly this:
import numpy as np
from scipy.spatial.distance import pdist,squareform
a = np.random.randint(0,10,(3,4))
# pairwise distances between columns, in condensed form
# (the values shown depend on the random draw)
pdist(a.T)
# array([ 8.60232527,  8.77496439, 10.29563014,  6.70820393,  8.1240384 ,
#         3.        ])
# the same distances expanded to the full square table
squareform(pdist(a.T))
# array([[ 0.        ,  8.60232527,  8.77496439, 10.29563014],
#        [ 8.60232527,  0.        ,  6.70820393,  8.1240384 ],
#        [ 8.77496439,  6.70820393,  0.        ,  3.        ],
#        [10.29563014,  8.1240384 ,  3.        ,  0.        ]])
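If scipy is not an option, a numpy-only alternative that avoids building a large 3-D intermediate is to expand the squared distance as ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b and get all the dot products from one matrix product. This is a different technique from the ones posted above, offered only as a sketch: the name all_column_euclidean_gram is hypothetical, and the clipping step is there because rounding error can push tiny squared distances slightly below zero.

```python
import numpy as np

def all_column_euclidean_gram(x):
    """Pairwise column distances via the Gram-matrix identity.

    Hypothetical helper name; assumes x is a 2-D numeric array.
    """
    x = np.asarray(x, dtype=float)
    # Squared Euclidean norm of each column, shape (cols,)
    sq_norms = np.einsum("ij,ij->j", x, x)
    # gram[i, j] = dot product of column i and column j
    gram = x.T @ x
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
    sq_dist = sq_norms[:, np.newaxis] + sq_norms[np.newaxis, :] - 2.0 * gram
    # Clip small negative values caused by floating-point rounding
    np.maximum(sq_dist, 0.0, out=sq_dist)
    return np.sqrt(sq_dist)

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
gram_result = all_column_euclidean_gram(data)
print(gram_result)
```

This form is usually less accurate than subtracting the columns directly when two columns are nearly identical, so the direct version may be the safer choice for small inputs.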