Can this be vectorized (NumPy)?

Question    Votes: 1    Answers: 1

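I have a list of feature vectors and want to compute the L2 distance from each feature vector to all other feature vectors, as a measure of uniqueness. Here, min_distances[i] is the smallest L2 distance from the i-th feature vector to any other feature vector.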

import numpy as np

# Generate data
nrows = 2000
feature_length = 128
feature_vecs = np.random.rand(nrows, feature_length)

# Calculate min L2 norm from each feature vector
# to all other feature vectors
min_distances = np.zeros(nrows)
indices = np.arange(nrows)
for i in indices:
    min_distances[i] = np.min(np.linalg.norm(
        feature_vecs[i != indices] - feature_vecs[i],
        axis=1))

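Being O(n^2), it is painfully slow and I would like to optimize it. Can I get rid of the for loop and vectorize this so that min and linalg.norm are each called only once?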

python numpy vectorization
1 Answer
2 votes

Approach #1

Here's one with pdist and squareform -

from scipy.spatial.distance import pdist, squareform

d = squareform(pdist(feature_vecs))
np.fill_diagonal(d,np.nan)
min_distances = np.nanmin(d,axis=0)
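
As a quick sanity check, the sketch below compares the loop from the question with the pdist/squareform version on a small random input. The function names min_dist_loop and min_dist_pdist and the test sizes are made up for illustration; they are not part of the original post. Note that squareform builds a dense nrows x nrows matrix, so memory grows quadratically with the number of vectors.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def min_dist_loop(feature_vecs):
    """Reference: the explicit loop from the question."""
    nrows = feature_vecs.shape[0]
    indices = np.arange(nrows)
    out = np.zeros(nrows)
    for i in indices:
        out[i] = np.min(np.linalg.norm(
            feature_vecs[i != indices] - feature_vecs[i], axis=1))
    return out


def min_dist_pdist(feature_vecs):
    """Pairwise distances in one call, then mask the zero diagonal."""
    d = squareform(pdist(feature_vecs))   # dense (nrows, nrows) matrix
    np.fill_diagonal(d, np.nan)           # ignore each vector's distance to itself
    return np.nanmin(d, axis=0)


# Small illustrative input (sizes chosen arbitrarily for the check)
rng = np.random.default_rng(0)
small = rng.random((200, 16))
agree = np.allclose(min_dist_loop(small), min_dist_pdist(small))
print(agree)
```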

Approach #2

Another with cKDTree -

from scipy.spatial import cKDTree

min_distances = cKDTree(feature_vecs).query(feature_vecs, k=2)[0][:,1]
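
To make the k=2 indexing explicit: when the tree is queried with its own points, the nearest hit for each point is the point itself (distance 0), so column 1 holds the distance to the nearest other vector. The sketch below unpacks that and checks it against a brute-force computation on a small random input; the variable names and sizes are illustrative assumptions, not from the original answer. How much a k-d tree helps at 128 dimensions is worth measuring rather than assuming, since tree searches tend to lose their advantage as dimensionality grows.

```python
import numpy as np
from scipy.spatial import cKDTree

# Small illustrative input (sizes chosen arbitrarily)
rng = np.random.default_rng(0)
pts = rng.random((300, 8))

# Query the two nearest neighbours of every point within the same set
dists, idx = cKDTree(pts).query(pts, k=2)
self_dist = dists[:, 0]       # distance of each point to itself (0)
min_distances = dists[:, 1]   # distance to the nearest *other* point
nearest_other = idx[:, 1]     # row index of that nearest other point

# Brute-force reference via broadcasting (fine only for small inputs)
diff = pts[:, None, :] - pts[None, :, :]
full = np.sqrt((diff ** 2).sum(axis=-1))
np.fill_diagonal(full, np.inf)
brute = full.min(axis=1)

print(np.allclose(min_distances, brute))
```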