I have an XML file that looks like this:
library(tidyverse)
library(xml2)
x <- read_xml('<root>
<group id= "1">
<subgroup>bla</subgroup>
<subgroup>bla2</subgroup>
<subgroup>bla3</subgroup>
</group>
<group id="2">
<subgroup>qsdfbla</subgroup>
<subgroup>bla2qsdf</subgroup>
<subgroup>bla3qfsd</subgroup>
<subgroup>qsdfqfsd</subgroup>
</group>
</root>')
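I want to add an id attribute to every subgroup node, holding its sequence number within its group. The first value should be 1, then 2, then 3, and the numbering should start again from 1 in the second group.
I tried: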
x %>%
xml_find_all('//group') %>%
map(~xml_children(.) %>% xml_set_attr("idSubGroup",seq_along(.)))
But all I managed to do was put a 1 in every idSubGroup attribute. How can I really "seq along"?
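Here is a solution that is essentially a loop within a loop: find the group nodes, then look up each group's child nodes one group at a time and update each child individually. I don't believe this step can be vectorized.
I used the lapply and sapply functions here, but they could be converted to the purrr package if needed.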
library(xml2)
library(tidyverse)
x <- read_xml('<root>
<group id= "1">
<subgroup>bla</subgroup>
<subgroup>bla2</subgroup>
<subgroup>bla3</subgroup>
</group>
<group id="2">
<subgroup>qsdfbla</subgroup>
<subgroup>bla2qsdf</subgroup>
<subgroup>bla3qfsd</subgroup>
<subgroup>qsdfqfsd</subgroup>
</group>
</root>')
# find all of the group nodes
groups <- x %>% xml_find_all('//group')
lapply(groups, function(group) {
  # find all of the children nodes in each group
  cnodes <- group %>% xml_children(.)
  # loop through each child node and add subgroup number
  # (seq_along() rather than 1:length() so an empty group is skipped safely)
  sapply(seq_along(cnodes), function(node) {
    cnodes[node] %>% xml_set_attr("idSubGroup", node)
  })
})
print(x)
# {xml_document}
# <root>
# [1] <group id="1">\n <subgroup idSubGroup="1">bla</subgroup>\n <subgroup idSubGroup="2">bla2</subgroup>\n < ...
# [2] <group id="2">\n <subgroup idSubGroup="1">qsdfbla</subgroup>\n <subgroup idSubGroup="2">bla2qsdf</subgro ...
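Because print(x) truncates each group, it can help to read the attributes back and check them directly. The following is a minimal sketch of the same nested-loop idea written with plain for loops; the variable name `ids` is only illustrative and is not part of the code above.

```r
library(xml2)

x <- read_xml('<root>
<group id= "1">
<subgroup>bla</subgroup>
<subgroup>bla2</subgroup>
<subgroup>bla3</subgroup>
</group>
<group id="2">
<subgroup>qsdfbla</subgroup>
<subgroup>bla2qsdf</subgroup>
<subgroup>bla3qfsd</subgroup>
<subgroup>qsdfqfsd</subgroup>
</group>
</root>')

# outer loop: one pass per group node
groups <- xml_find_all(x, "//group")
for (g in seq_along(groups)) {
  # element children of the current group only
  cnodes <- xml_children(groups[[g]])
  # inner loop: number the children, restarting at 1 for every group
  for (i in seq_along(cnodes)) {
    xml_set_attr(cnodes[[i]], "idSubGroup", i)
  }
}

# read every idSubGroup back in document order to verify the numbering
ids <- xml_attr(xml_find_all(x, "//subgroup"), "idSubGroup")
print(ids)
# [1] "1" "2" "3" "1" "2" "3" "4"
```

The attributes are modified in place on x, so nothing needs to be reassigned after the loops. Attribute values come back as character strings, which is why the check compares against "1" rather than 1.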