R(空间距离计算)性能缓慢

问题描述 投票:0回答:1

我正在 R 中使用两个 shapefile:一个代表 5841 个地点(作为点),另一个代表道路(作为线 - 总共 99 个观测值)。我的目标是计算从每个地点到最近道路的最小距离,并确定道路上与该最小距离相对应的确切纬度/经度位置。本质上,对于每个点,我想找到任何线上最近的点并测量这两点之间的距离。

这是我采用的方法,使用 sf 包:

library(sf)

# Load the shapefiles
localities <- st_read("path/to/localities.shp")
roads <- st_read("path/to/roads.shp")

# Ensure CRS match, transform if necessary
if (st_crs(localities) != st_crs(roads)) {
  roads <- st_transform(roads, st_crs(localities))
}

# Calculate nearest roads and distances
nearest_roads_info <- st_nearest_feature(localities, roads)
distances_to_nearest_road <- st_distance(localities, roads[nearest_roads_info, , drop = FALSE])
closest_points <- st_nearest_points(localities, roads[nearest_roads_info, , drop = FALSE])

# Extracting closest points on roads
closest_points_on_roads <- st_cast(closest_points, "POINT")[,2]

# Assigning distances and coordinates to the localities data frame
localities$distance_to_nearest_road <- distances_to_nearest_road
coords <- st_coordinates(st_sfc(closest_points_on_roads, crs = st_crs(localities)))
localities$closest_road_lat <- coords[, "Y"]
localities$closest_road_lon <- coords[, "X"]

问题是这段代码需要永远运行。我怀疑性能缓慢可能是由于我计算距离和提取最近道路点的方式造成的,但我不确定如何最好地优化这些操作。我可以遵循任何建议、技巧或替代方法吗?

谢谢!

r optimization gis spatial
1个回答
0
投票

这是一种相对快速且简单的方法,涉及沿每条道路定期创建点,然后使用这些点作为最近线边缘的代理。它不一定会为您提供一条线的最近部分的精确位置,但它在计算上非常“轻”。此示例沿每条道路每隔 0.1 米放置一个点,但可以使用更精细的分辨率来提高准确性。

此示例只有 5 条道路,约 23 公里。为了提供一些有关效率的背景信息,使用 5,841 个位置和 336,901 个道路点,在一台非常普通的计算机上计算距离大约需要 40 秒。

示例数据和所需包:

library(sf)
library(lwgeom)
library(dplyr)

set.seed(1) # For reproducibility

pt_cnt <- 5841
localities <- data.frame(location_id = 1:pt_cnt,
                         latitude = runif(pt_cnt, min = 42, max = 42.05),
                         longitude = runif(pt_cnt, min = 18, max = 18.05)) %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

roads <- data.frame(road_name = c(rep(1:3, each = 6), 
                                  rep(4:5, each = 3)),
                    latitude = c(rep(c(42.001, 42.022, 42.05), each = 6),
                                 rep(c(42.001, 42.022, 42.05), 2)),
                    longitude = c(rep(c(18.001, 18.022, 18.023,
                                        18.024, 18.025, 18.049), 3),
                                  rep(c(18.001, 18.049), each = 3))) %>%
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326) %>%
  group_by(road_name) %>%
  summarise(geometry = st_combine(geometry)) %>%
  st_cast("LINESTRING")

生成道路的 POINT 版本,每 n 个单位有一个点。此步骤将返回警告:

警告信息: 在 st_cast.sf(., "POINT") 中: 所有子几何体的重复属性,它们可能不是恒定的

您可以安全地忽略它。

p <- st_split(roads, p) %>%
  st_collection_extract(., "LINESTRING") %>%
  st_segmentize(., units::set_units(0.1, m)) %>% # == st_segmentize(., 1)
  st_cast("POINT") %>%
  mutate(road_point = 1:n())

p
Simple feature collection with 336901 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 18.001 ymin: 42.001 xmax: 18.049 ymax: 42.05
Geodetic CRS:  WGS 84
# A tibble: 336,901 × 3
   road_name          geometry road_point
 *     <int>       <POINT [°]>      <int>
 1         1   (18.001 42.001)          1
 2         1   (18.001 42.001)          2
 3         1   (18.001 42.001)          3
 4         1   (18.001 42.001)          4
 5         1   (18.001 42.001)          5
 6         1   (18.001 42.001)          6
 7         1   (18.001 42.001)          7
 8         1   (18.001 42.001)          8
 9         1 (18.00101 42.001)          9
10         1 (18.00101 42.001)         10
# ℹ 336,891 more rows
# ℹ Use `print(n = ...)` to see more rows

最后,找到最近的道路点并计算距离:

result <- localities %>%
  mutate(road_point = st_nearest_feature(., p)) %>%
  left_join(., data.frame(p), by = join_by(road_point)) %>%
  mutate(distance_m = st_distance(geometry.x, geometry.y, by_element = TRUE))

result

Simple feature collection with 5841 features and 4 fields
Active geometry column: geometry.x
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 18.00001 ymin: 42.00001 xmax: 18.04995 ymax: 42.05
Geodetic CRS:  WGS 84
First 10 features:
   location_id road_point road_name                geometry.x              geometry.y   distance_m
1            1     224983         4 POINT (18.01112 42.01328) POINT (18.001 42.01328) 836.3658 [m]
2            2      80540         2 POINT (18.00865 42.01861) POINT (18.00865 42.022) 377.4238 [m]
3            3      89274         2 POINT (18.01424 42.02864) POINT (18.01424 42.022) 738.5826 [m]
4            4     191940         3 POINT (18.03883 42.04541)  POINT (18.03883 42.05) 510.4101 [m]
5            5     285540         5 POINT (18.04332 42.01008) POINT (18.049 42.01008) 469.5687 [m]
6            6     166970         3 POINT (18.02007 42.04492)  POINT (18.02007 42.05) 564.9463 [m]
7            7     155291         3 POINT (18.01258 42.04723)  POINT (18.01258 42.05) 307.6446 [m]
8            8     317053         5 POINT (18.04298 42.03304) POINT (18.049 42.03304) 496.8863 [m]
9            9     249662         4 POINT (18.01076 42.03146) POINT (18.001 42.03146) 806.2031 [m]
10          10      52214         1 POINT (18.03699 42.00309) POINT (18.03699 42.001) 232.2519 [m]
© www.soinside.com 2019 - 2024. All rights reserved.