如何创建(使用R的sf)一个空间数据框,其几何列是其他两个数据框的几何列之间的差异?

问题描述 投票:0回答:1

我有两个使用 R 中的

sf
包加载的空间数据框。在第一个数据框中,我有 (61) 个美国县,而在第二个数据框中,我有这 (61) 个县中每个县的子部分。两个数据框具有相同的标识列以及这些标识列的相同值,因此对于第一个数据框中的每个县,第二个数据框中有一个且唯一的对应子部分。我的目标是:我想获得第三个空间数据框,其几何列是县及其子部分之间的差异。

我尝试基本上使用以下代码来实现上述想法:

library(tidyverse)
library(magrittr)
library(sf)

# Loading shapefiles
dfc <- st_read('data/counties.shp') %>%      # Shapefile with (61) US counties
  arrange(., st, cty)                        # Columns that uniquely identify observations
dfp <- st_read('data/county_parts.shp') %>%  # Shapefile with (61) US county parts
  arrange(., st, cty)                        # Columns that uniquely identify observations

# Geometry difference between each corresponding pair of observations
geom <- map(seq(nrow(dfc)), function(r) 
            st_difference(dfc[r, 'geo_cty'], dfp[r, 'geo_sub'])

(在上面的代码中,

geo_cty
是县几何的列,
geo_sub
是县几何子部分。按
st
cty
排序后,
r
dfc中的每一行
dfp
 
具有相同的
(st, cty)
值,因此行
geo_cty
覆盖
geo_sub
。)

上面的代码运行顺利,但是我没有成功创建一个空间数据框,其几何列是我创建的

geom
列表。例如,我尝试运行以下代码:

df <- bind_cols(dfc, select(dfp, geo_sub), geom)

我收到以下错误消息:

Error in `stop_vctrs()`:
! Can't recycle `..1` (size 61) to match `..58` (size 0).
Run `rlang::last_error()` to see where the error occurred.

如何创建所需的空间数据框?实施我所解释的想法的最佳方法是什么?

r dataframe geometry r-sf
1个回答
0
投票

我宁愿在此处使用 join 来确保几何图形正确对齐,这也可以处理数据集长度因某种原因而不同的情况(以下示例生成此类输入)。 Join 创建一个具有 2 个几何列的

sf
对象,可以使用
map2()
+
st_difference()
:

方便地进行变异
library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
library(patchwork)

### prepare reprex
# sf example dataset
nc <- 
  read_sf(system.file("shape/nc.shp", package="sf")) |> 
  select(CNTY_ID, NAME)

# reduce the sample to Johnston and neighbouring polygons
nc <- st_filter(nc, st_buffer(nc[nc$NAME == "Johnston",], 10))

# generate "subregions", buffers around centroids;
# drop few records (20%) to simulte a partially matching datasest
set.seed(123)
nc_sub <- nc |>
  select(-NAME) |>
  st_centroid() |>
  st_buffer(10000) |>
  slice_sample(prop = .8)

# -------------------------------------------------------------------------
# join datasets, the other sf object (left_join y arg) must be converted to 
# tibble/data.frame first; result is an sf object with 2 geometry columns 
nc <- nc_sub |>
  st_set_geometry("geo_sub") |>
  as_tibble() |>
  left_join(x = nc, y = _, by = "CNTY_ID")

# find difference between geom columns, 
# store results in 3rd geomerty column (geom_diff)
nc <- nc |>
  mutate(geo_diff = purrr::map2(geometry, geo_sub, st_difference) |> 
                    st_sfc(crs = st_crs(geometry)))


# we can change active geometry and drop other geometry columns to  get a 
# regular single-geometry sf object, if needed:
nc_diff <- st_set_geometry(nc, "geo_diff") |> select(!starts_with("geo"))

产生的

sf
对象:

nc_diff
#> Simple feature collection with 8 features and 2 fields
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: -79.21665 ymin: 34.55375 xmax: -77.67122 ymax: 36.26004
#> Geodetic CRS:  NAD27
#> # A tibble: 8 × 3
#>   CNTY_ID NAME                                                          geo_diff
#>     <dbl> <chr>                                                   <GEOMETRY [°]>
#> 1    1897 Franklin MULTIPOLYGON (((-78.25455 35.81553, -78.26685 35.84838, -78.…
#> 2    1913 Nash     POLYGON ((-78.20562 35.7254, -78.21155 35.73819, -78.23405 3…
#> 3    1938 Wake     POLYGON ((-78.99881 35.60132, -78.93889 35.76144, -78.94444 …
#> 4    1979 Wilson   POLYGON ((-78.1249 35.59752, -78.16245 35.68298, -78.1618 35…
#> 5    1989 Johnston MULTIPOLYGON (((-78.53874 35.31512, -78.53947 35.33646, -78.…
#> 6    2029 Wayne    POLYGON ((-78.16518 35.19322, -78.2574 35.22095, -78.31013 3…
#> 7    2030 Harnett  POLYGON ((-78.71606 35.25998, -78.81239 35.25872, -78.87457 …
#> 8    2083 Sampson  POLYGON ((-78.11374 34.69918, -78.15676 34.67715, -78.25681 …

可视化:

g1 <- ggplot(nc) + 
  geom_sf() +
  ggtitle("regions") +
  theme_bw()

g2 <- ggplot() + 
  geom_sf(data = st_union(nc), fill = NA) +
  geom_sf(data = nc_sub) +
  ggtitle("sub-regions") +
  theme_bw() +
  theme(axis.text.y = element_blank())

g3 <- ggplot(nc_diff) + 
  geom_sf() +
  ggtitle("st_difference()") +
  theme_bw() +
  theme(axis.text.y = element_blank())

wrap_plots(g1,g2,g3)

输入数据集,

nc.shp
的一个小子集:

nc
#> Simple feature collection with 8 features and 2 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -79.21665 ymin: 34.55375 xmax: -77.67122 ymax: 36.26004
#> Geodetic CRS:  NAD27
#> # A tibble: 8 × 3
#>   CNTY_ID NAME                                                          geometry
#> *   <dbl> <chr>                                               <MULTIPOLYGON [°]>
#> 1    1897 Franklin (((-78.25455 35.81553, -78.26685 35.84838, -78.30841 35.8934…
#> 2    1913 Nash     (((-78.18693 35.72511, -78.20562 35.7254, -78.21155 35.73819…
#> 3    1938 Wake     (((-78.92107 35.57886, -78.99881 35.60132, -78.93889 35.7614…
#> 4    1979 Wilson   (((-78.06533 35.58204, -78.1249 35.59752, -78.16245 35.68298…
#> 5    1989 Johnston (((-78.53874 35.31512, -78.53947 35.33646, -78.60082 35.4030…
#> 6    2029 Wayne    (((-78.16319 35.18229, -78.16518 35.19322, -78.2574 35.22095…
#> 7    2030 Harnett  (((-78.61274 35.24383, -78.71606 35.25998, -78.81239 35.2587…
#> 8    2083 Sampson  (((-78.11377 34.72099, -78.11374 34.69918, -78.15676 34.6771…
加入后

nc
(注意缺少
geo_sub
记录)和
st_difference()
,3个几何列:

nc
#> Simple feature collection with 8 features and 2 fields
#> Active geometry column: geometry
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -79.21665 ymin: 34.55375 xmax: -77.67122 ymax: 36.26004
#> Geodetic CRS:  NAD27
#> # A tibble: 8 × 5
#>   CNTY_ID NAME                                geometry                   geo_sub
#> *   <dbl> <chr>                     <MULTIPOLYGON [°]>             <POLYGON [°]>
#> 1    1897 Franklin (((-78.25455 35.81553, -78.26685 3…                     EMPTY
#> 2    1913 Nash     (((-78.18693 35.72511, -78.20562 3… ((-77.92674 35.88657, -7…
#> 3    1938 Wake     (((-78.92107 35.57886, -78.99881 3… ((-78.65973 35.87538, -7…
#> 4    1979 Wilson   (((-78.06533 35.58204, -78.1249 35… ((-77.83458 35.6446, -77…
#> 5    1989 Johnston (((-78.53874 35.31512, -78.53947 3…                     EMPTY
#> 6    2029 Wayne    (((-78.16319 35.18229, -78.16518 3… ((-77.89675 35.34607, -7…
#> 7    2030 Harnett  (((-78.61274 35.24383, -78.71606 3… ((-78.93731 35.44001, -7…
#> 8    2083 Sampson  (((-78.11377 34.72099, -78.11374 3… ((-78.29317 35.05105, -7…
#> # ℹ 1 more variable: geo_diff <GEOMETRY [°]>

创建于 2023-11-01,使用 reprex v2.0.2

© www.soinside.com 2019 - 2024. All rights reserved.