Extracting specific data from a KML attribute table using R

Question (votes: 0, answers: 1)

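I am automating the process of downloading a KML file from the internet and extracting data from it. The value I am after is buried inside one large string, and I can't figure out how to pull out the single value I need.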

library(sf)  # provides st_read()

basins <- c('CARSON - CRSN CITY L (STWN2LLF)',
            'CARSON - CRSN CITY L (STWN2LUF)',
            'EF CARSON - GRDNVL L (GRDN2LLF)',
            'EF CARSON - GRDNVL L (GRDN2LUF)',
            'EF CARSON-MRKLEEVLLE (CEMC1HLF)',
            'EF CARSON-MRKLEEVLLE (CEMC1HUF)',
            'WF CARSON - WOODFRDS (WOOC1HOF)')

date <- 'Mar_04_2023'
# build the download URL for the chosen date
url_upper <- paste0('https://www.cnrfc.noaa.gov/archive/sweBasins/SWEbasinsVal_', date, '.kml')
kml_upper <- st_read(url_upper)
# keep only the basins of interest
kml_upper <- subset(kml_upper, kml_upper$Name %in% basins)
# drop the geometry column, leaving only the attribute table
kml_upper$geometry <- NULL
head(kml_upper, 3)

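The data looks like this:

I need to extract 43.81 in., which comes right after "...Simulated Basin Snow Water Equivalent for Mar 04, 2023". I couldn't paste the exact text here because SO interprets it as formatting.

What is the best way to extract the data?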

r xml kml
1 answer
1 vote

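It looks like the information you are after is stored as an HTML table inside the KML's XML. I am sure sf has a way to extract it, but that is not the approach shown here.
Instead, I use the xml2 library to pull the HTML out of the XML, then use rvest to parse that text as HTML and extract the table into a usable form.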

Of the basins in your list, I could only find 3 in the downloaded file.

library(dplyr)
library(rvest)
library(xml2)

page <- xml2::read_xml(url_upper)
#xml2::xml_ns(page)
xml_ns_strip(page)  #strip the namespace (modifies page in place)

#find all of the placemarks
places <- page %>% xml_find_all(".//Placemark")
#get the names
namesOfPlaces <- places %>% xml_find_first(".//name") %>% xml_text()
#find the places whose names are in the basin list (3 in this case)
placesOfInterest <- which(namesOfPlaces %in% basins)
#xml_find_first() returns one description per placemark, so positions stay aligned with namesOfPlaces
descriptionsInPlaces <- places %>% xml_find_first(".//description") %>% xml_text()
#reduce descriptions down to the ones of interest
descriptionsInPlaces <- descriptionsInPlaces[placesOfInterest]

#loop through the descriptions, extracting the desired information
answer <- lapply(descriptionsInPlaces, function(node){
   convertHTML <- read_html(node)
   convertHTML %>% html_elements("table") %>% html_table()
})
names(answer) <- namesOfPlaces[placesOfInterest]
#a list of tables with the requested information
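
To see which basins from the list did not match any placemark, a set difference is one option. The sketch below is self-contained and uses a stand-in `namesOfPlaces` holding only the three names that appear in the output; in the real script that vector comes from the KML and contains every placemark name. `missing_basins` is a name introduced here for illustration.

```r
# Basin list from the question
basins <- c('CARSON - CRSN CITY L (STWN2LLF)',
            'CARSON - CRSN CITY L (STWN2LUF)',
            'EF CARSON - GRDNVL L (GRDN2LLF)',
            'EF CARSON - GRDNVL L (GRDN2LUF)',
            'EF CARSON-MRKLEEVLLE (CEMC1HLF)',
            'EF CARSON-MRKLEEVLLE (CEMC1HUF)',
            'WF CARSON - WOODFRDS (WOOC1HOF)')

# ASSUMPTION: stand-in for the names read from the KML file,
# limited to the three basins that were found
namesOfPlaces <- c('EF CARSON-MRKLEEVLLE (CEMC1HUF)',
                   'EF CARSON - GRDNVL L (GRDN2LUF)',
                   'CARSON - CRSN CITY L (STWN2LUF)')

# Basins requested but absent from the file
missing_basins <- setdiff(basins, namesOfPlaces)
print(missing_basins)
```

If a basin you expected shows up here, comparing its spelling and spacing against the names in the file is a reasonable first check.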

Here is the resulting data. I leave it to the reader to extract the first row of each data frame in the list.

    answer
$`EF CARSON-MRKLEEVLLE (CEMC1HUF)`
$`EF CARSON-MRKLEEVLLE (CEMC1HUF)`[[1]]
# A tibble: 3 × 2
  X1                                                      X2       
  <chr>                                                   <chr>    
1 Simulated Basin Snow Water Equivalent for Mar 04, 2023: 43.81 in.
2 SWE Percent of Normal:                                  207%     
3 Average month-to-date SWE through Mar 04, 2023:         21.12 in.


$`EF CARSON - GRDNVL L (GRDN2LUF)`
$`EF CARSON - GRDNVL L (GRDN2LUF)`[[1]]
# A tibble: 3 × 2
  X1                                                      X2       
  <chr>                                                   <chr>    
1 Simulated Basin Snow Water Equivalent for Mar 04, 2023: 16.07 in.
2 SWE Percent of Normal:                                  429%     
3 Average month-to-date SWE through Mar 04, 2023:         3.75 in. 


$`CARSON - CRSN CITY L (STWN2LUF)`
$`CARSON - CRSN CITY L (STWN2LUF)`[[1]]
# A tibble: 3 × 2
  X1                                                      X2       
  <chr>                                                   <chr>    
1 Simulated Basin Snow Water Equivalent for Mar 04, 2023: 15.26 in.
2 SWE Percent of Normal:                                  307%     
3 Average month-to-date SWE through Mar 04, 2023:         4.97 in. 
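
One possible way to pull the first-row value out of each table and turn it into a number is sketched below. To stay runnable without a download, it rebuilds a small stand-in for `answer` from two of the tables printed above (as plain data frames rather than tibbles); `swe_text` and `swe` are names introduced here for illustration. It assumes every value has the form "<number> in.".

```r
# ASSUMPTION: toy stand-in for `answer`, copied from the printed output:
# a named list whose elements are lists holding one two-column table
labels <- c("Simulated Basin Snow Water Equivalent for Mar 04, 2023:",
            "SWE Percent of Normal:",
            "Average month-to-date SWE through Mar 04, 2023:")
answer <- list(
  "EF CARSON-MRKLEEVLLE (CEMC1HUF)" = list(
    data.frame(X1 = labels, X2 = c("43.81 in.", "207%", "21.12 in."))),
  "EF CARSON - GRDNVL L (GRDN2LUF)" = list(
    data.frame(X1 = labels, X2 = c("16.07 in.", "429%", "3.75 in.")))
)

# Row 1, column X2 of the first table for each basin
swe_text <- vapply(answer, function(tbls) tbls[[1]]$X2[1], character(1))

# Drop the trailing " in." unit and convert to numeric
swe <- data.frame(
  basin  = names(swe_text),
  swe_in = as.numeric(sub("\\s*in\\.\\s*$", "", swe_text)),
  row.names = NULL
)
print(swe)
```

Stripping only the trailing unit, rather than every non-digit character, avoids leaving the period from "in." attached to the number.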