How do I compute the Euclidean distance (in meters) from each coordinate pair in a Pandas DataFrame to its nearest neighbor?

I have a DataFrame like this:

index  place_id      var_lat_fact  var_lon_fact
0      167312091448  5.6679820000  -0.0144950000
1      167312091448  5.6686320000  -0.0157910000
2      167312091448  5.6653530000  -0.0181980000
3      167312091448  5.6700970000  -0.0191400000
4      167312091448  5.6689810000  -0.0104040000

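For each coordinate pair (latitude, longitude), I want to compute the Euclidean distance to its nearest neighbor within the DataFrame. Each point should then get a value in an additional column (e.g., nearest_neighbour_dist) giving that distance in meters.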

Something like this:

index  place_id      var_lat_fact  var_lon_fact   nearest_neighbour_dist
0      167312091448  5.6679820000  -0.0144950000  123
1      167312091448  5.6686320000  -0.0157910000  342
2      167312091448  5.6653530000  -0.0181980000  312
3      167312091448  5.6700970000  -0.0191400000  42
4      167312091448  5.6689810000  -0.0104040000  23

I really can't wrap my head around this problem... Any help would be much appreciated.

python pandas coordinates nearest-neighbor euclidean-distance
1 Answer

import pandas as pd
from io import StringIO
from scipy.spatial import KDTree

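# Load test data. The header is comma-separated but the rows are
# space-separated, so spaces are replaced with commas before parsing.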
s = """
place_id,var_lat_fact,var_lon_fact
167312091448 5.6679820000 -0.0144950000
167312091448 5.6686320000 -0.0157910000
167312091448 5.6653530000 -0.0181980000
167312091448 5.6700970000 -0.0191400000
167312091448 5.6689810000 -0.0104040000
""".replace(' ', ',')

df = pd.read_csv(StringIO(s))

# Create kd Tree
points = df[['var_lat_fact', 'var_lon_fact']].values
kd = KDTree(points)

# Compute the closest two neighbors for each point
distances, indexes = kd.query(points, k=2)

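# Discard the first 'neighbor' (the point itself, i.e. distance=0)
# and select the second. Note that the inputs are latitude/longitude
# in degrees, so this distance is in degrees, not meters.
df['nearest_neighbour_dist'] = distances[:, 1]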

print(df)

Output:

       place_id  var_lat_fact  var_lon_fact  nearest_neighbour_dist
0  167312091448      5.667982     -0.014495                0.001450
1  167312091448      5.668632     -0.015791                0.001450
2  167312091448      5.665353     -0.018198                0.004068
3  167312091448      5.670097     -0.019140                0.003655
4  167312091448      5.668981     -0.010404                0.004211
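
The question asks for meters, while a KDTree built directly on latitude/longitude returns distances in degrees. One possible way to get meters, sketched here under some assumptions, is to project the coordinates onto a local flat plane first and then run the same KDTree query. The helper name add_nearest_neighbour_dist_m, the constant EARTH_RADIUS_M (a mean radius of 6,371 km), and the choice of an equirectangular projection around the mean latitude are all assumptions of this sketch, not part of the original answer. The approximation is only reasonable when the points are close together, as they are here.

```python
import numpy as np
import pandas as pd
from scipy.spatial import KDTree

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumption of this sketch)


def add_nearest_neighbour_dist_m(df, lat_col="var_lat_fact", lon_col="var_lon_fact"):
    """Return a copy of df with the nearest-neighbour distance in meters.

    Hypothetical helper: uses a flat (equirectangular) approximation,
    which is only suitable for points that lie close to each other.
    """
    lat = np.radians(df[lat_col].to_numpy())
    lon = np.radians(df[lon_col].to_numpy())

    # Project to planar x/y in meters around the mean latitude.
    x = EARTH_RADIUS_M * lon * np.cos(lat.mean())
    y = EARTH_RADIUS_M * lat
    xy = np.column_stack([x, y])

    # k=2 because the closest hit for each point is the point itself (distance 0).
    distances, _ = KDTree(xy).query(xy, k=2)

    out = df.copy()
    out["nearest_neighbour_dist"] = distances[:, 1]
    return out


df = pd.DataFrame({
    "place_id": [167312091448] * 5,
    "var_lat_fact": [5.667982, 5.668632, 5.665353, 5.670097, 5.668981],
    "var_lon_fact": [-0.014495, -0.015791, -0.018198, -0.019140, -0.010404],
})
result = add_nearest_neighbour_dist_m(df)
print(result)
```

With the sample data, the first two points are each other's nearest neighbour, at roughly 160 m. For points spread over a larger area, a great-circle (haversine) distance would be a better fit than this flat approximation.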

