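I am trying to run density-based spatial clustering (DBSCAN) on a point cloud dataset, which is a series of points with x, y, z coordinates. One of the parameters is the minimum distance. How can I find the minimum distance between one point and any other point in space in Python? Many thanks!
Sample data: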
import numpy as np
distance = lambda p1, p2: np.sqrt(np.sum((p1 - p2) ** 2, axis=0))
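I can't think of anything better than the naive O(n²) approach for finding the minimum distance:
import itertools
def min_distance(cloud):
    # Compare every unordered pair of points exactly once.
    pairs = itertools.combinations(cloud, 2)
    # Use the built-in min on a generator; np.min cannot reduce a lazy map object in Python 3.
    return min(distance(p1, p2) for p1, p2 in pairs)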
Finally, you just need to read the points from your file, which I assume looks like this:
cloud.csv
x, y, z
1.2, 3.4, 2.55
2.77, 7.34, 23.4
5.66, 64.3, 4.33
def get_points(filename):
    with open(filename, 'r') as file:
        # skip_header=1 skips the "x, y, z" header row.
        rows = np.genfromtxt(file, delimiter=',', skip_header=1)
    return rows
Final code
import itertools
import numpy as np
distance = lambda p1, p2: np.sqrt(np.sum((p1 - p2) ** 2, axis=0))
def min_distance(cloud):
    # Compare every unordered pair of points exactly once.
    pairs = itertools.combinations(cloud, 2)
    # Use the built-in min on a generator; np.min cannot reduce a lazy map object in Python 3.
    return min(distance(p1, p2) for p1, p2 in pairs)
def get_points(filename):
    with open(filename, 'r') as file:
        # skip_header=1 skips the "x, y, z" header row.
        rows = np.genfromtxt(file, delimiter=',', skip_header=1)
    return rows
filename = 'cloud.csv'
cloud = get_points(filename)
min_dist = min_distance(cloud)
print(min_dist)
Output
21.277006368378046
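As a sanity check on that number, here is a small stdlib-only sketch that lists every pairwise distance for the three sample rows, so you can see which pair produces the minimum. The names `points`, `pair_distances`, and `closest_pair` are mine, not part of the answer above, and `math.dist` needs Python 3.8 or later.

```python
import itertools
import math

# The three rows from the sample cloud.csv above.
points = [(1.2, 3.4, 2.55), (2.77, 7.34, 23.4), (5.66, 64.3, 4.33)]

# Euclidean distance for every unordered pair of points, keyed by row indices.
pair_distances = {
    (i, j): math.dist(points[i], points[j])
    for i, j in itertools.combinations(range(len(points)), 2)
}

for (i, j), d in pair_distances.items():
    print(f"point {i} to point {j}: {d:.4f}")

# The key with the smallest distance identifies the closest pair.
closest_pair = min(pair_distances, key=pair_distances.get)
print(closest_pair, pair_distances[closest_pair])
```

The first two rows are the closest pair, which matches the output shown above.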
Edit: As Amiga500 pointed out, you can use scipy.spatial.distance. We can then rewrite min_distance as follows:
import numpy as np
from scipy.spatial.distance import pdist
min_distance = lambda cloud: np.min(pdist(cloud))
import numpy as np
from scipy.spatial.distance import pdist
min_distance = lambda cloud: np.min(pdist(cloud))
print(min_distance(cloud))
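A lambda expression only defines a function. You need to call it with your data to get a result. (I don't have enough reputation to comment, so I'm adding this as an answer.)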
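To make that point concrete, here is a minimal sketch using a stdlib-only stand-in for the `min_distance` lambda (so it runs without SciPy). The three points in `toy_cloud` are made up for illustration and are not from the question.

```python
import itertools
import math

# Stdlib-only stand-in for the min_distance lambda above.
min_distance = lambda cloud: min(
    math.dist(p, q) for p, q in itertools.combinations(cloud, 2)
)

# Hypothetical toy points: the closest pair is (0, 0, 0) and (3, 4, 0).
toy_cloud = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]

# The bare name is just a function object; nothing has been computed yet.
print(min_distance)

# Calling it with data is what produces the number.
result = min_distance(toy_cloud)
print(result)  # 5.0
```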
For a large number of points:
import numpy as np
from scipy.spatial import cKDTree
# assume data stored as [x, y, z] numpy array called 'cloud'
tree = cKDTree(cloud)
nn1_dist = tree.query(cloud, k=2, workers=-1)[0][:,1]
nn1_dist = nn1_dist[nn1_dist != 0] # remove 0s
np.min(nn1_dist)
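If you want to convince yourself the KD-tree route gives the same answer as the brute-force one, a rough sketch like the following may help. It wraps the snippet above in a function (`min_nn_distance` is my name for it) and compares it with an O(n²) NumPy computation on a random cloud. The random data and the size of 500 points are assumptions made purely for the comparison.

```python
import numpy as np
from scipy.spatial import cKDTree


def min_nn_distance(cloud):
    """Smallest non-zero nearest-neighbour distance in an (n, 3) array."""
    cloud = np.asarray(cloud, dtype=float)
    tree = cKDTree(cloud)
    # k=2 because the closest hit for every point is the point itself.
    dists, _ = tree.query(cloud, k=2)
    nearest = dists[:, 1]
    # Exact duplicates give a distance of 0; drop them, as the snippet above does.
    nearest = nearest[nearest > 0]
    return float(nearest.min())


# Hypothetical random cloud, used only to compare against brute force.
rng = np.random.default_rng(0)
demo_cloud = rng.random((500, 3))

# Brute force: full n x n distance matrix via broadcasting.
diff = demo_cloud[:, None, :] - demo_cloud[None, :, :]
brute = np.sqrt((diff ** 2).sum(axis=-1))
np.fill_diagonal(brute, np.inf)  # ignore each point's distance to itself

kd_result = min_nn_distance(demo_cloud)
print(kd_result, brute.min())
```

The two printed values should agree up to floating-point rounding. The brute-force matrix needs memory proportional to n², which is why the tree-based query is the better fit once the cloud gets large.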