How to load all fields / ExtendedData (not just 'name' and 'description') from a KML layer into R


I have been working on loading KML files into R to build a web map with Leaflet / Shiny. The import itself is straightforward (using this sample KML):

library(rgdal)

sampleKml <- readOGR("D:/KML_Samples.kml", layer = ogrListLayers("D:/KML_Samples.kml")[1])

In this example, ogrListLayers lists all layers in the KML file, and I subset to the first element/layer only. Simple enough.

The problem is that reading a KML layer this way only pulls in two fields, "Name" and "Description", as shown below:

> sampleKml <- readOGR("D:/KML_Samples.kml", layer = ogrListLayers("D:/KML_Samples.kml")[1])
OGR data source with driver: KML 
Source: "D:/KML_Samples.kml", layer: "Placemarks"
with 3 features
It has 2 fields
> sampleKml@data
                Name                                                                                  Description
1   Simple placemark Attached to the ground. Intelligently places itself at the height of the underlying terrain.
2 Floating placemark                                                  Floats a defined distance above the ground.
3 Extruded placemark                                              Tethered to the ground by a customizable "tail" 

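So R reads the KML layer as a SpatialPointsDataFrame with 3 features (3 distinct points) and 2 fields (columns). However, when I pull the layer into QGIS and open its attribute table, there are many more fields besides Name and Description, as seen here.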

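As far as I understand, "Name" and "Description" are standard KML Placemark elements, and any other data is treated as ExtendedData. I would like to import this extended data together with the Placemark data.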

Is there a way to pull all of these KML layer fields/attributes into R? Preferably with readOGR(), but I am open to all suggestions.

r kml rgdal kmz ogr
1 Answer

TL;DR

The underlying problem is that libKML is missing on Windows. My solution is a function that extracts the data directly from the KML file.

Problem

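I ran into the same problem, and after some googling it seems to be related to LibKML and Windows. Running the same code on my Ubuntu machine gave a different result: the ExtendedData was retrieved when loading the saved KML file.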

library(rgdal)
library(dplyr)
poly_df<-data.frame(x=c(1,1,0,0),y=c(1,0,0,1))
poly<-poly_df %>% 
  Polygon %>% 
  list %>% 
  Polygons(ID="1") %>% 
  list %>% 
  SpatialPolygons(proj4string = CRS("+init=epsg:4326")) %>% 
  SpatialPolygonsDataFrame(data=data.frame(test="this is a test"))

writeOGR(poly,"test.kml",driver="KML",layer="poly")
poly2<-readOGR("test.kml")
poly2@data

If someone manages to build LibKML [1], they will be able to load KML files with ExtendedData [2].
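One way to see which situation a given machine is in is to list the GDAL drivers that R can use. The sketch below uses sf::st_drivers() rather than rgdal; using sf here is my assumption, not something the steps above depend on. It only reports whether a driver named LIBKML is available.

```r
library(sf)

# st_drivers() lists the GDAL drivers available to this sf installation.
vector_drivers <- st_drivers(what = "vector")

# LIBKML is the GDAL driver built on libkml; KML is the basic driver.
has_libkml <- "LIBKML" %in% vector_drivers$name
has_kml <- "KML" %in% vector_drivers$name

print(c(LIBKML = has_libkml, KML = has_kml))
```

If LIBKML is reported as FALSE, only the basic KML driver (if any) is in use, which matches the two-field result from the question.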

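On Windows, LibKML has to be built with Visual Studio 2005 [1]. That version of Visual Studio is no longer supported [3]. In [3], user2889419 provides a link to the 2005 version. I downloaded and installed it, but building LibKML ultimately failed with a large number of errors and warnings (some files did not exist). I stopped there because I was outside my comfort zone, but I wanted to share how far I got.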

Solution in R

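My solution is to read the KML file directly and extract the ExtendedData, while loading the spatial object with rgdal's readOGR. My assumption is that readOGR and my extraction routine both start at the top of the file, so features and extracted rows come out in the same order. The two are then merged and returned as a SpatialPolygonsDataFrame. At first I had some trouble extracting nodes from the KML file because I was not aware of the concept of namespaces [4]. (I edited the function below because I ran into trouble with KML files from other sources.)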

library(rgdal)
library(xml2)
library(dplyr)  # provides the %>% pipe

readKML <- function(file,keep_name_description=FALSE,layer,...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  #   in the resulting SpatialPolygonsDataFrame. Only works when there is
  #   ExtendedData in the kml file.

  sp_obj<-readOGR(file,layer,...)
  xml1<-read_xml(file)
  if (!missing(layer)) {
    different_layers <- xml_find_all(xml1, ".//d1:Folder") 
    layer_names <- different_layers %>% 
      xml_find_first(".//d1:name") %>% 
      xml_contents() %>% 
      xml_text()

    selected_layer <- layer_names==layer
    if (!any(selected_layer)) stop("Layer does not exist.")
    xml2 <- different_layers[selected_layer]
  } else {
    xml2 <- xml1
  }

  # extract name and type of variables

  variable_names1 <- 
    xml_find_first(xml2, ".//d1:ExtendedData") %>% 
    xml_children() 

  while(variable_names1 %>% 
        xml_attr("name") %>% 
        is.na() %>% 
        any()&variable_names1 %>%
        xml_children() %>% 
        length>0) variable_names1 <- variable_names1 %>%
    xml_children()

  variable_names <- variable_names1 %>%
    xml_attr("name") %>% 
    unique()

  # return sp_obj if no ExtendedData is present
  # (xml_attr() on an empty node set gives character(0), not NULL)
  if (length(variable_names)==0 || all(is.na(variable_names))) return(sp_obj)

  data1 <- xml_find_all(xml2, ".//d1:ExtendedData") %>% 
    xml_children()

  while(data1 %>%
        xml_children() %>% 
        length>0) data1 <- data1 %>%
    xml_children()

  data <- data1 %>% 
    xml_text() %>% 
    matrix(.,ncol=length(variable_names),byrow = TRUE) %>% 
    as.data.frame()

  colnames(data) <- variable_names

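  if (keep_name_description) {
    # keep "Name" and "Description" and append the ExtendedData columns
    try(sp_obj@data <- cbind(sp_obj@data,data),silent=TRUE)
  } else {
    # replace the attribute table with the ExtendedData columns only
    sp_obj@data <- data
  }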
  sp_obj
}
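The namespace issue mentioned above can be seen in isolation with a small sketch. The KML text, the field names (population, region) and the values are made up for illustration; only the xml2 calls are real. Because every element sits in the default KML namespace, an XPath without a prefix matches nothing, while the d1: prefix that xml2 assigns to the default namespace does.

```r
library(xml2)

# A tiny hypothetical KML document, written inline for illustration only.
kml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>A</name>
      <ExtendedData>
        <Data name="population"><value>100</value></Data>
        <Data name="region"><value>north</value></Data>
      </ExtendedData>
    </Placemark>
    <Placemark>
      <name>B</name>
      <ExtendedData>
        <Data name="population"><value>250</value></Data>
        <Data name="region"><value>south</value></Data>
      </ExtendedData>
    </Placemark>
  </Document>
</kml>'

doc <- read_xml(kml_text)

# Without a prefix the query finds nothing: the elements are namespaced.
no_prefix <- xml_find_all(doc, ".//ExtendedData")

# xml2 exposes the default namespace under the prefix "d1".
ns <- xml_ns(doc)
data_nodes <- xml_find_all(doc, ".//d1:ExtendedData/d1:Data", ns)
field_names <- unique(xml_attr(data_nodes, "name"))

# Values come back in document order, one row per Placemark.
values <- xml_text(xml_find_all(doc, ".//d1:ExtendedData/d1:Data/d1:value", ns))
extended <- as.data.frame(
  matrix(values, ncol = length(field_names), byrow = TRUE),
  stringsAsFactors = FALSE
)
colnames(extended) <- field_names
```

This sketch assumes every Placemark carries the same fields in the same order, which is the same assumption the function above makes when it reshapes the values into a matrix.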

Old: extracting via readLines

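My solution is to read the KML file directly and extract the ExtendedData, while loading the spatial object with rgdal's readOGR. My assumption is that readOGR and my extraction routine both start at the top of the file. The two are then merged and returned as a SpatialPolygonsDataFrame.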

library(tidyverse)
library(rgdal)

readKML<-function(file,keep_name_description=FALSE,...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns 
  #   in the resulting SpatialPolygonsDataFrame. Only works when there is 
  #   ExtendedData in the kml file.

  if (!grepl("\\.kml$",file)) stop("File is not a KML file.")
  if (!file.exists(file)) stop("File does not exist.")
  map<-readOGR(file,...)

  f1<-readLines(file)

  # get positions of ExtendedData in document
  exdata_position<-grep("ExtendedData",f1) %>% 
    matrix(ncol=2,byrow = TRUE) %>% 
    apply(1,function(x) {
      pos<-x[1]:x[2]
      pos[2:(length(pos)-1)]
    }) %>% 
    t %>% 
    as.data.frame

  # if there is no ExtendedData return SpatialPolygonsDataFrame
  if (ncol(exdata_position)==0) return(map)

  # Get Name of different columns
  extract1<-f1[exdata_position[1,] %>% 
                 unlist]  
  names_of_data<-extract1 %>% 
    strsplit("name=\"") %>%
    lapply(function(x) strsplit(x[[2]],split="\"") ) %>%
    unlist(recursive = FALSE) %>%
    lapply(function(x) return(x[1])) %>% 
    unlist

  # Extract Extended Data
  dat<-lapply(seq(nrow(exdata_position)),function(x) {
    extract2<-f1[exdata_position[x,] %>% 
                   unlist]  
    extract2 %>% 
      strsplit(">") %>%
      lapply(function(x) strsplit(x[[2]],split="<") ) %>% unlist(recursive = FALSE) %>%
      lapply(function(x) return(x[1])) %>% 
      unlist %>% 
      matrix(nrow=1) %>% 
      as.data.frame
  }) %>% 
    do.call(rbind,.)

  # Rename columns
  colnames(dat)<-names_of_data

  # Check if Name and Description should be dropped
  if (keep_name_description) {
    map@data<-cbind(map@data,dat)
  } else {
    map@data<-dat
  }
  map
}
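The line-based idea behind this older version can be sketched with base R alone. The file content and field names below are hypothetical, and the sketch assumes that the opening and closing ExtendedData tags each sit on their own line with one Data element per line in between, which real KML files do not guarantee.

```r
# Hypothetical KML content, one element per line (an assumption of this sketch).
kml_lines <- c(
  '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>',
  '<Placemark><name>A</name>',
  '<ExtendedData>',
  '<Data name="population"><value>100</value></Data>',
  '<Data name="region"><value>north</value></Data>',
  '</ExtendedData>',
  '</Placemark>',
  '<Placemark><name>B</name>',
  '<ExtendedData>',
  '<Data name="population"><value>250</value></Data>',
  '<Data name="region"><value>south</value></Data>',
  '</ExtendedData>',
  '</Placemark>',
  '</Document></kml>'
)
path <- tempfile(fileext = ".kml")
writeLines(kml_lines, path)
f1 <- readLines(path)

# Lines mentioning ExtendedData alternate between opening and closing tags.
tag_lines <- grep("ExtendedData", f1)
bounds <- matrix(tag_lines, ncol = 2, byrow = TRUE)

# For each block, pull the field name and the value out of every Data line.
rows <- lapply(seq_len(nrow(bounds)), function(i) {
  block <- f1[(bounds[i, 1] + 1):(bounds[i, 2] - 1)]
  fields <- sub('.*name="([^"]*)".*', "\\1", block)
  vals <- sub(".*<value>(.*)</value>.*", "\\1", block)
  setNames(as.list(vals), fields)
})
extended <- do.call(rbind, lapply(rows, as.data.frame, stringsAsFactors = FALSE))
```

Parsing by line position is fragile when a file is formatted differently, which is presumably why the newer version above switched to an XML parser.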

[1] https://github.com/google/libkml/wiki/Building-and-installing-libkml
[2] https://github.com/r-spatial/sf/issues/499
[3] Where to download visual studio express 2005?
[4] Parsing XML in R: Incorrect namespaces
