How can I add a sequential id attribute within each group using R xml2 and purrr?

I have an XML file that looks like this:

library(tidyverse)
library(xml2)

x <- read_xml('<root>
                <group id= "1">
                  <subgroup>bla</subgroup>
                  <subgroup>bla2</subgroup>
                  <subgroup>bla3</subgroup>
                </group>
                <group id="2">
                  <subgroup>qsdfbla</subgroup>
                  <subgroup>bla2qsdf</subgroup>
                  <subgroup>bla3qfsd</subgroup>
                  <subgroup>qsdfqfsd</subgroup>
                </group>
              </root>')

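I want to add an id attribute to every subgroup node that is a sequence number within its group. The first subgroup should get 1, then 2, then 3, and the numbering should start again from 1 in the second group.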

I tried:

x %>%
  xml_find_all('//group') %>%
  map(~xml_children(.) %>% xml_set_attr("idSubGroup",seq_along(.)))

But all I managed to do was put 1 in every idSubGroup attribute. How can I really "seq along"?

r xml purrr
1 answer

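Here is a solution that is essentially a loop within a loop: find the group nodes, then, for each group, find its child nodes and update each of them individually. I don't believe this step can be vectorized. I used the lapply and sapply functions here, but they could be converted to purrr functions if desired.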

library(xml2)
library(tidyverse)

x <- read_xml('<root>
              <group id= "1">
              <subgroup>bla</subgroup>
              <subgroup>bla2</subgroup>
              <subgroup>bla3</subgroup>
              </group>
              <group id="2">
              <subgroup>qsdfbla</subgroup>
              <subgroup>bla2qsdf</subgroup>
              <subgroup>bla3qfsd</subgroup>
              <subgroup>qsdfqfsd</subgroup>
              </group>
              </root>')


#find all of the group nodes
groups<-x %>%   xml_find_all('//group')

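#invisible() suppresses the list of NULLs that lapply would otherwise print
invisible(lapply(groups, function(group){
   #find all of the children nodes in each group
   cnodes<-group %>% xml_children(.)
   #loop through each child node and add subgroup number
   #seq_along() is safe for a group with no children, where 1:length() would give c(1, 0)
   sapply(seq_along(cnodes), function(node){cnodes[node] %>% xml_set_attr("idSubGroup",node) })
}))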

print(x)
# {xml_document}
# <root>
# [1] <group id="1">\n  <subgroup idSubGroup="1">bla</subgroup>\n  <subgroup idSubGroup="2">bla2</subgroup>\n  < ...
# [2] <group id="2">\n  <subgroup idSubGroup="1">qsdfbla</subgroup>\n  <subgroup idSubGroup="2">bla2qsdf</subgro ...
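
The answer mentions that the lapply/sapply loops could be converted to purrr. Below is a minimal sketch of one possible conversion, not the answerer's own code. It assumes purrr's walk(), which is meant for functions called for their side effects, and it iterates over positions rather than over the nodesets themselves, which is a choice made here to keep the indexing explicit. The value is passed through as.character() as a precaution, on the assumption that a character value is the safest input for xml_set_attr().

```r
library(xml2)
library(purrr)

x <- read_xml('<root>
                <group id="1">
                  <subgroup>bla</subgroup>
                  <subgroup>bla2</subgroup>
                  <subgroup>bla3</subgroup>
                </group>
                <group id="2">
                  <subgroup>qsdfbla</subgroup>
                  <subgroup>bla2qsdf</subgroup>
                  <subgroup>bla3qfsd</subgroup>
                  <subgroup>qsdfqfsd</subgroup>
                </group>
              </root>')

# find all of the group nodes
groups <- xml_find_all(x, "//group")

# outer loop: one pass per group
walk(seq_along(groups), function(g) {
  # children of the current group only, so numbering restarts for each group
  children <- xml_children(groups[[g]])
  # inner loop: set the attribute on each child to its position in the group
  walk(seq_along(children), function(i) {
    xml_set_attr(children[[i]], "idSubGroup", as.character(i))
  })
})

# read the attribute back from every subgroup, in document order
ids <- xml_attr(xml_find_all(x, "//subgroup"), "idSubGroup")
print(ids)
```

As in the lapply version, the nodes are modified in place, so x itself carries the new attributes afterwards. The printed ids should count 1 to 3 for the first group and start again at 1 for the second.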