How to extract longitude and latitude from a list in R

Question (votes: 0, answers: 2)

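I need help extracting longitude and latitude information from a list. I have many street addresses, and I use this service to get the latitude and longitude of each one: https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress. Here is my code: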

fetch_geocodes <- function(address) {
  # Specify the API endpoint
  base_url <- "https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress"
  
  # Specify the parameters to pass to the API
  params <- list(
    address = address,
    benchmark = "Public_AR_Current",  
    vintage = "Current_Current",
    format = "json"
  )
  
  # Send a GET request to the API
  response <- GET(url = base_url, query = params)
  
  # Check if the request was successful
  if (status_code(response) == 200) {
    # Parse the response to JSON
    data <- content(response, "parsed")
    
    # Print the entire JSON response
    print(data)
    
    # Extract the longitude and latitude
    longitude <- data$result$addressMatches$coordinates$x
    latitude <- data$result$addressMatches$coordinates$y
    
    return(c(longitude, latitude))
  } else {
    stop("Request failed with status ", status_code(response))
  }
}

addresses <- c("Riverside Dr, Apple Valley, CA, 92307",
               "11 Wall Street, New York, NY 10005")
geocodes <- lapply(addresses, fetch_geocodes)

Here is part of my output, since the full output is very long:

$result
$result$input
$result$input$address
$result$input$address$address
[1] "Riverside Dr, Apple Valley, CA, 92307"


$result$input$vintage
$result$input$vintage$isDefault
[1] TRUE

$result$input$vintage$id
[1] "4"

$result$input$vintage$vintageName
[1] "Current_Current"

$result$input$vintage$vintageDescription
[1] "Current Vintage - Current Benchmark"


$result$input$benchmark
$result$input$benchmark$isDefault
[1] TRUE

$result$input$benchmark$benchmarkDescription
[1] "Public Address Ranges - Current Benchmark"

$result$input$benchmark$id
[1] "4"

$result$input$benchmark$benchmarkName
[1] "Public_AR_Current"



$result$addressMatches
list()

$result
$result$input
$result$input$address
$result$input$address$address
[1] "11 Wall Street, New York, NY 10005"

$result$addressMatches[[1]]$coordinates
$result$addressMatches[[1]]$coordinates$x
[1] -74.01073

$result$addressMatches[[1]]$coordinates$y
[1] 40.70714

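For the first address, Riverside Dr, Apple Valley, CA, 92307, the service returns no longitude or latitude, so I need to assign NA to the "longitude" and "latitude" columns. For the second address, $result$addressMatches[[1]]$coordinates holds the longitude and latitude. However, I don't know how to extract that information from geocodes, because it contains only NULL: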

print(geocodes)
[[1]]
NULL

[[2]]
NULL

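I don't understand how to handle this, and I would appreciate your help. My goal is a data frame with three columns: the first is full_address, the second is longitude, and the third is latitude.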

r list latitude-longitude
2 Answers
1 vote

Up front: data$result$addressMatches is a list, and each of its elements may have coordinates, so you can do something like data$result$addressMatches[[1]]$coordinates$x.
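To see why the function in the question returns NULL, here is a minimal sketch. The nested list is a hand-built stand-in modeled on the printed output above (an assumption, not real API output), so it runs without a network call:

```r
# Hypothetical stand-in for content(response, "parsed"):
# addressMatches is an UNNAMED list holding one match.
data <- list(result = list(addressMatches = list(
  list(coordinates = list(x = -74.01073, y = 40.70714))
)))

# `$` looks an element up by name. The outer list has no element
# named "coordinates", so the lookup silently yields NULL.
wrong_x <- data$result$addressMatches$coordinates$x
wrong_y <- data$result$addressMatches$coordinates$y

# Indexing the first match by position reaches the value.
right_x <- data$result$addressMatches[[1]]$coordinates$x

# Combining two NULLs gives NULL, which is what the function returned.
combined <- c(wrong_x, wrong_y)
```

Because `$` on a list returns NULL for a missing name instead of raising an error, the same thing happens for the empty list() returned for the unmatched address.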

If it is guaranteed that the result always contains exactly one x/y pair, then you can do:

unlist(data$result$addressMatches[[1]]$coordinates)
#         x         y 
# -74.01073  40.70714 

However, if you can get two or more matches, then you need to return a list or a data.frame, and that takes a bit more work:

L <- lapply(data$result$addressMatches, function(z) {
  if ("coordinates" %in% names(z)) unlist(z$coordinates) else c(x=NA_real_,y=NA_real_)
})
list(x=sapply(L, `[[`, 1), y=sapply(L, `[[`, 2))
# $x
# [1] -74.01073
# $y
# [1] 40.70714
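For the data.frame variant mentioned above, one possible sketch follows. The helper name matches_to_df and the sample matches (including the second match that lacks coordinates) are hypothetical, chosen only to exercise both branches:

```r
# Hypothetical helper: turn a list of address matches into a data.frame
# with one row per match and columns x (longitude) and y (latitude).
matches_to_df <- function(matches) {
  # No matches at all: return an empty frame with the same columns.
  if (length(matches) == 0) {
    return(data.frame(x = numeric(0), y = numeric(0)))
  }
  L <- lapply(matches, function(z) {
    if ("coordinates" %in% names(z)) unlist(z$coordinates) else c(x = NA_real_, y = NA_real_)
  })
  # Each element is a named numeric vector, so rbind() keeps x/y as column names.
  as.data.frame(do.call(rbind, L))
}

# Made-up input: one match with coordinates, one without.
matches <- list(
  list(coordinates = list(x = -74.01073, y = 40.70714)),
  list(matchedAddress = "hypothetical match without coordinates")
)
coords <- matches_to_df(matches)
empty  <- matches_to_df(list())
```

Returning a zero-row frame for an empty input keeps the column layout stable, which makes it easier to bind results from many addresses later.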

Going with the first assumption, the function becomes:

library(httr)  # provides GET(), status_code(), and content()

fetch_geocodes <- function(address) {
  # Specify the API endpoint
  base_url <- "https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress"
  
  # Specify the parameters to pass to the API
  params <- list(
    address = address,
    benchmark = "Public_AR_Current",  
    vintage = "Current_Current",
    format = "json"
  )
  
  # Send a GET request to the API
  response <- GET(url = base_url, query = params)
  
  # Check if the request was successful
  if (status_code(response) == 200) {
    # Parse the response to JSON
    data <- content(response, "parsed")
    
    ### Print the entire JSON response
    # print(data)
    
    # Extract the longitude and latitude
    if (length(data$result$addressMatches) > 0) {
      longitude <- data$result$addressMatches[[1]]$coordinates$x
      if (is.null(longitude)) longitude <- NA_real_
      latitude <- data$result$addressMatches[[1]]$coordinates$y
      if (is.null(latitude)) latitude <- NA_real_
    } else {
      longitude <- latitude <- NA_real_
    }
    
    return(c(longitude, latitude))
  } else {
    stop("Request failed with status ", status_code(response))
  }
}
lapply(addresses, fetch_geocodes)
# [[1]]
# [1] NA NA
# [[2]]
# [1] -74.01073  40.70714
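To get from that list to the three-column data frame the question asks for, a sketch along these lines may work. To keep it runnable offline, geocodes is hard-coded here with the values shown above instead of calling the API; in practice it would be the result of lapply(addresses, fetch_geocodes):

```r
addresses <- c("Riverside Dr, Apple Valley, CA, 92307",
               "11 Wall Street, New York, NY 10005")

# Stand-in for lapply(addresses, fetch_geocodes): one c(longitude, latitude)
# pair per address, with NA where the geocoder found no match.
geocodes <- list(c(NA_real_, NA_real_),
                 c(-74.01073, 40.70714))

# Stack the pairs into a two-column matrix, one row per address.
coords <- do.call(rbind, geocodes)

result <- data.frame(
  full_address = addresses,
  longitude    = coords[, 1],
  latitude     = coords[, 2]
)
```

This relies on every element of geocodes having length two, which the NA handling in the function above is meant to ensure.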

0 votes

The tidygeocoder package is well suited to this. It supports several geocoding services, including the US Census service you are using.

library(tidygeocoder)
addresses <- c("Riverside Dr, Apple Valley, CA, 92307",
               "11 Wall Street, New York, NY 10005")

adr_df <- data.frame(address = addresses)

By default, tidygeocoder uses Nominatim, the OSM geocoder. It finds coordinates for both example addresses.

adr_df |>
  geocode(address = address)
#> Passing 2 addresses to the Nominatim single address geocoder
#> Query completed in: 2.8 seconds
#> # A tibble: 2 × 3
#>   address                                 lat   long
#>   <chr>                                 <dbl>  <dbl>
#> 1 Riverside Dr, Apple Valley, CA, 92307  34.5 -117. 
#> 2 11 Wall Street, New York, NY 10005     40.7  -74.0

Trying the Census geocoder, we see that the first address yields no coordinates here either.

adr_df |>
  geocode(address = address,
          method = "census")
#> Passing 2 addresses to the US Census batch geocoder
#> Query completed in: 0.6 seconds
#> # A tibble: 2 × 3
#>   address                                 lat  long
#>   <chr>                                 <dbl> <dbl>
#> 1 Riverside Dr, Apple Valley, CA, 92307  NA    NA  
#> 2 11 Wall Street, New York, NY 10005     40.7 -74.0