I want to extract the description values from a .kml file using R.
Here is the file:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2"
xmlns:gx="http://www.google.com/kml/ext/2.2"
xmlns:atom="http://www.w3.org/2005/Atom">
<Document>
<open>1</open>
<visibility>1</visibility>
<name><![CDATA[2013-07-06 4:18pm]]></name>
...
<Placemark>
<name><![CDATA[2013-07-06 4:18pm (Start)]]></name>
<description><![CDATA[]]></description>
<TimeStamp><when>2013-07-06T20:18:56.000Z</when></TimeStamp>
<styleUrl>#start</styleUrl>
<Point>
<coordinates>-78.353348,45.020615,340.29998779296875</coordinates>
</Point>
</Placemark>
<Placemark id="tour">
<name><![CDATA[2013-07-06 4:18pm]]></name>
<description><![CDATA[]]></description>
...
<gx:Track>
<when>2013-07-06T20:18:56.000Z</when>
<gx:coord>-78.353348 45.020615 340.29998779296875</gx:coord>
<when>2013-07-06T20:19:12.000Z</when>
<gx:coord>-78.353315 45.020644 340.29998779296875</gx:coord>
<when>2013-07-06T22:12:23.000Z</when>
<gx:coord>-78.353108 45.020736 342.29998779296875</gx:coord>
<ExtendedData>
...
<Placemark>
<name><![CDATA[2013-07-06 4:18pm (End)]]></name>
<description><![CDATA[Created by Google My Tracks on Android.
Name: 2013-07-06 4:18pm
Activity type: cycling
Description: -
Total distance: 49.62 km (30.8 mi)
Total time: 1:53:28
Moving time: 1:50:17
Average speed: 26.24 km/h (16.3 mi/h)
Average moving speed: 27.00 km/h (16.8 mi/h)
Max speed: 61.20 km/h (38.0 mi/h)
Average pace: 2.29 min/km (3.7 min/mi)
Average moving pace: 2.22 min/km (3.6 min/mi)
Fastest pace: 0.98 min/km (1.6 min/mi)
Max elevation: 406 m (1333 ft)
Min elevation: 265 m (868 ft)
Elevation gain: 690 m (2263 ft)
Max grade: 12 %
Min grade: -11 %
Recorded: 2013-07-06 4:18pm
]]></description>
...
</Placemark>
</Document>
</kml>
This is what I want to extract, the text contained in <![CDATA[ ... ]]> here:
<description><![CDATA[Created by Google My Tracks on Android.: ]]></description>
That is:
Name: 2013-07-06 4:18pm Activity type: cycling Description: - Total distance: 49.62 km (30.8 mi) Total time: 1:53:28 Moving time: 1:50:17 Average speed: 26.24 km/h (16.3 mi/h) Average moving speed: 27.00 km/h (16.8 mi/h) Max speed: 61.20 km/h (38.0 mi/h) Average pace: 2.29 min/km (3.7 min/mi) Average moving pace: 2.22 min/km (3.6 min/mi) Fastest pace: 0.98 min/km (1.6 min/mi) Max elevation: 406 m (1333 ft) Min elevation: 265 m (868 ft) Elevation gain: 690 m (2263 ft) Max grade: 12 % Min grade: -11 % Recorded: 2013-07-06 4:18pm
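xmlToList gives me what I think is NULL, because the CDATA marker means the enclosed content is not processed by the parser:
library(XML)
xml <- xmlTreeParse("test1.kml", useInternalNodes=TRUE)
xmllist <- xmlToList(xml)
xmllist$Document$Placemark$description
[[1]]
NULL
[I think that is what this means: "The term CDATA is used about text data that should not be parsed by the XML parser ... Everything inside a CDATA section is ignored by the parser. A CDATA section starts with "<![CDATA[" and ends with "]]>"".]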
The following does not work for me either, perhaps for the same CDATA-related reason:
z1 <- xpathApply(xml, "//description", xmlValue)
z1
list()
Can anyone help me extract the text from the file?
Here is a link to the file: https://docs.google.com/file/d/0B__iOdFGJbXYOHJGbWJVNW0tS3M/edit?usp=sharing
doc <- xmlTreeParse("test1.kml", useInternalNodes = TRUE)
root <- xmlRoot(doc)
xmlValue(root[["Document"]][["name"]])
[1] "2013-07-06 4:18pm"
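Jake Burkhead answered this question in the comments. His solution did the job, and I am very grateful for it. This is how to extract the text from the .kml file: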
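The snippet below is not Jake Burkhead's original code. It is only a minimal sketch of one way to get the text with the XML package, assuming the empty result from "//description" comes from the KML default namespace (http://www.opengis.net/kml/2.2), which an unprefixed XPath name does not match. The inline kml_text string is a trimmed, hypothetical stand-in for test1.kml, and the prefix k is an arbitrary choice.

```r
library(XML)

# Hypothetical, trimmed-down stand-in for test1.kml (illustration only)
kml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
<Placemark>
<name><![CDATA[2013-07-06 4:18pm (Start)]]></name>
<description><![CDATA[]]></description>
</Placemark>
<Placemark>
<name><![CDATA[2013-07-06 4:18pm (End)]]></name>
<description><![CDATA[Created by Google My Tracks on Android.
Name: 2013-07-06 4:18pm
Activity type: cycling
Total distance: 49.62 km (30.8 mi)
]]></description>
</Placemark>
</Document>
</kml>'

# Parse from a string; with the real file this would be xmlParse("test1.kml")
doc <- xmlParse(kml_text, asText = TRUE)

# Bind the KML default namespace to a prefix so XPath can address the elements
ns <- c(k = "http://www.opengis.net/kml/2.2")

# Select only descriptions that actually contain text (skips the empty one)
desc <- xpathSApply(doc, "//k:Placemark/k:description[normalize-space()]",
                    xmlValue, namespaces = ns)

cat(desc)
```

xmlValue returns the CDATA text like any other text content, so CDATA itself is probably not the obstacle. The NULL from xmlToList is likely explained by `$Placemark` picking the first Placemark, whose description is empty.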
A good way to solve this problem is to read the data with the xml2 package.
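As a rough sketch of what the xml2 route could look like: the example reads a KML string, looks up the first non-empty description, and splits its "Key: value" lines into a named character vector. The kml_text string is a hypothetical, shortened stand-in for test1.kml, and the names lines and stats are made up for illustration. xml2 exposes the default namespace under the prefix d1 via xml_ns().

```r
library(xml2)

# Hypothetical, shortened stand-in for test1.kml (illustration only)
kml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document>
<Placemark>
<name><![CDATA[2013-07-06 4:18pm (End)]]></name>
<description><![CDATA[Created by Google My Tracks on Android.
Name: 2013-07-06 4:18pm
Activity type: cycling
Total time: 1:53:28
Max grade: 12 %
]]></description>
</Placemark>
</Document>
</kml>'

# read_xml() accepts literal XML; with the real file use read_xml("test1.kml")
doc <- read_xml(kml_text)

# The default KML namespace is registered under the prefix "d1"
ns <- xml_ns(doc)

# First description element that contains any text
node <- xml_find_first(doc, "//d1:Placemark/d1:description[normalize-space()]", ns)
desc <- xml_text(node)

# Split into lines and keep only those shaped like "Key: value"
lines <- trimws(strsplit(desc, "\n", fixed = TRUE)[[1]])
lines <- lines[grepl(": ", lines, fixed = TRUE)]

# Key is everything before the first ": ", value is everything after it
stats <- setNames(sub("^[^:]*: ", "", lines), sub(": .*$", "", lines))

print(stats)
```

Splitting on the first ": " rather than on every colon keeps values such as the total time intact. An alternative to the d1 prefix would be calling xml_ns_strip(doc) first and then querying "//description" directly.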