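I am trying to create a point layer from longitude and latitude, then use st_join with st_within to match the points to census tracts. But the points all end up in Kansas. If you use the tidycensus library with an API key, the code below is reproducible.
Code for the dummy data points, located mostly in Colorado, and for the tract boundaries of Colorado and Kansas: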
library(tidycensus)
library(sf)
library(dplyr)
library(tidyverse)
# Set seed for reproducibility
set.seed(42)
# Generate dummy data for points in Colorado
points <- data.frame(
longitude = runif(300, min = -109, max = -102), # Approximate longitude boundaries of Colorado
latitude = runif(300, min = 36.993076, max = 41) # Approximate latitude boundaries of Colorado
)
# Convert the data frame to an sf point layer
points <- st_as_sf(points, coords = c("longitude", "latitude"), crs = "ESRI:102003")
tract2010 <- get_decennial(geography = "tract", variables = "P001001", year = 2010,
state = as.list(c("Colorado", "Kansas")), geometry = TRUE)
tract2010$state_code <- substr(tract2010$GEOID, 1, 2)
table(tract2010$state_code)
# make same CRS
tract2010 <- st_transform(tract2010, st_crs(points))
Map the points in leaflet to check that they are in the right place:
# test where it is
library(leaflet)
leaflet() %>%
addTiles() %>%
addMarkers(data = points)
Run the join and check the matches. According to the table, every point falls in state code 20 (Kansas):
#spatial join
points <- st_join(points, tract2010, join = st_within)
table(points$state_code, useNA = "always")
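You defined the point layer as ESRI:102003, but the longitude and latitude in your raw point data are in WGS84 or NAD83. Passing crs = "ESRI:102003" to st_as_sf only labels the coordinates; it does not reproject them. Your degree values were therefore read as metres in the Albers projection, which puts every point within roughly 100 m of the projection origin (96°W, 37.5°N), and that origin lies in Kansas. For clarity, I have copied your entire code and commented the extra step you need. The following assumes your raw point data are in NAD83 (EPSG:4269); if that is not correct, substitute the correct EPSG code: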
library(tidycensus)
library(sf)
library(dplyr)
library(tidyverse)
library(ggplot2)
# Set seed for reproducibility
set.seed(42)
# Generate dummy data for points
points <- data.frame(
longitude = runif(300, min = -109, max = -102), # Approximate longitude boundaries of Colorado
latitude = runif(300, min = 36.993076, max = 41) # Approximate latitude boundaries of Colorado
)
# NAD83 points to ESRI:102003
points <- st_as_sf(points, coords = c("longitude", "latitude")) %>%
st_set_crs(4269) %>% # This is the bit you missed
st_transform("ESRI:102003")
# Get census tracts
tract2010 <- get_decennial(geography = "tract", variables = "P001001", year = 2010,
state = as.list(c("Colorado", "Kansas")), geometry = TRUE)
# Create new state_code variable
tract2010$state_code <- substr(tract2010$GEOID, 1, 2)
# Transform
tract2010 <- st_transform(tract2010, st_crs(points))
# Spatial join
points <- st_join(points, tract2010, join = st_within)
ggplot() +
geom_sf(data = tract2010) +
geom_sf(data = points,
aes(colour = state_code))
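To see the difference between declaring a CRS and transforming to one without needing a Census API key, here is a minimal sketch using a single made-up point near Denver. The point and the object names (pt, wrong, right, wrong_lonlat) are only for illustration, and NAD83 (EPSG:4269) is assumed for the raw coordinates, as above. Transformed back to longitude/latitude, the mislabelled point should land close to the origin of the Albers projection rather than in Colorado.

```r
library(sf)

# One illustrative point near Denver, as plain longitude/latitude in degrees
pt <- data.frame(longitude = -105, latitude = 39.7)

# Wrong: this only labels the degree values as ESRI:102003 coordinates (metres)
wrong <- st_as_sf(pt, coords = c("longitude", "latitude"), crs = "ESRI:102003")

# Right: declare the CRS the numbers are actually in (assumed NAD83), then reproject
right <- st_as_sf(pt, coords = c("longitude", "latitude"), crs = 4269)
right <- st_transform(right, "ESRI:102003")

# The mislabelled point keeps its raw values; the reprojected one does not
print(st_coordinates(wrong))
print(st_coordinates(right))

# Convert the mislabelled point back to longitude/latitude to see where it really sits
wrong_lonlat <- st_coordinates(st_transform(wrong, 4269))
print(wrong_lonlat)
```

If both layers report the same st_crs() but the coordinates printed for the points are still in the range of degrees, that is a sign the CRS was declared rather than transformed.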