Using a function to complete incompletely linked documents (document trees) with a lookup list

Question

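I have linked documents (document trees) stored in a list (`list`).

Some document trees contain incomplete items (flagged with `search=1`). A tree may contain more than one incomplete item, each flagged with `search=1`.

I would like to extend/complete these incomplete trees using a lookup list of document trees (`list_lookup`). There is always exactly one matching tree in `list_lookup`. The `level` of the matching document tree has to be adjusted to fit the document tree in `list`.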

Sample data and desired output:

library(tidyverse)

# initial df1, aaa is incomplete (it is in fact linked to other documents, but this information is stored in the lookup list)
 
df1 <- tibble(id_from=c(NA_character_,"111","222","333","444","444","bbb"),
             id_to=c("111","222","333","444","aaa","bbb","ccc"),
             level=c(0,1,2,3,4,4,5),
             search=c(0,0,0,0,1,0,0))
df1
#> # A tibble: 7 × 4
#>   id_from id_to level search
#>   <chr>   <chr> <dbl>  <dbl>
#> 1 <NA>    111       0      0
#> 2 111     222       1      0
#> 3 222     333       2      0
#> 4 333     444       3      0
#> 5 444     aaa       4      1
#> 6 444     bbb       4      0
#> 7 bbb     ccc       5      0


# lookup dfs, df2 contains the further document links of aaa
df2 <- tibble(id_from=c(NA,"aaa","x","x"),
             id_to=c("aaa","x","x1","x2"),
             level=c(0,1,2,2))

df3 <- tibble(id_from=c(NA,"thank"),
                     id_to=c("thank","you"),
                     level=c(0,1))

#list with df
list <- list(df1)

#list with lookups
list_lookup <- list(df2,df3)

list_lookup
#> [[1]]
#> # A tibble: 4 × 3
#>   id_from id_to level
#>   <chr>   <chr> <dbl>
#> 1 <NA>    aaa       0
#> 2 aaa     x         1
#> 3 x       x1        2
#> 4 x       x2        2
#> 
#> [[2]]
#> # A tibble: 2 × 3
#>   id_from id_to level
#>   <chr>   <chr> <dbl>
#> 1 <NA>    thank     0
#> 2 thank   you       1

#what I need; an updated list of dfs where information from the lookup list are included

df1_wanted <- tibble(id_from=c(NA_character_,"111","222","333","444","444","aaa","bbb","x","x"),
                     id_to=c("111","222","333","444","aaa","bbb","x","ccc","x1","x2"),
                     level=c(0,1,2,3,4,4,5,5,6,6))

list(df1_wanted)
#> [[1]]
#> # A tibble: 10 × 3
#>    id_from id_to level
#>    <chr>   <chr> <dbl>
#>  1 <NA>    111       0
#>  2 111     222       1
#>  3 222     333       2
#>  4 333     444       3
#>  5 444     aaa       4
#>  6 444     bbb       4
#>  7 aaa     x         5  <- added from df2, level adjusted
#>  8 bbb     ccc       5
#>  9 x       x1        6  <- added from df2, level adjusted
#> 10 x       x2        6  <- added from df2, level adjusted

Created on 2023-04-01 with reprex v2.0.2

My approach:

I thought about using `purrr::map` to map a function over each item of `list`, but I am not sure what this function should look like.

Answer 1

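In this solution:

First, I define a recursive function, `get_tree()`, which takes a single `id` and a lookup table and returns the full tree for that `id` from the table.
Then, I define a function, `complete_trees()`, which takes a data frame and a list of lookup tables, calls `get_tree()` for each `id_to` where `search == 1` and for each lookup table, adjusts `level`, and binds the results to the initial data frame.
Finally, I apply `complete_trees()` to each element of `list`.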
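
The level adjustment in the second step can be looked at in isolation. The following is a minimal sketch, not part of the original answer, that reuses `df2` from the question and hard-codes the level of the flagged row (`444 -> aaa`, level 4) in a hypothetical variable `parent_level`. Because the root of the lookup tree sits at level 0 and corresponds to the flagged row itself, adding the flagged row's level to every descendant is enough; no extra offset is needed.

```r
library(dplyr)

# Lookup tree for "aaa", copied from the question
df2 <- tibble(id_from = c(NA, "aaa", "x", "x"),
              id_to   = c("aaa", "x", "x1", "x2"),
              level   = c(0, 1, 2, 2))

# Assumption for illustration: level of the flagged row 444 -> aaa in df1
parent_level <- 4

# Drop the root row of the lookup tree (it duplicates the flagged row),
# then shift the remaining levels by the flagged row's level
shifted <- df2 |>
  filter(!is.na(id_from)) |>
  mutate(level = level + parent_level)

shifted$level
# levels 1, 2, 2 become 5, 6, 6
```

In the full solution the root row is skipped implicitly, since `get_tree()` only returns rows whose `id_from` matches the requested `id`.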

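# Recursively collect all rows of `lookup` that sit below `id`
get_tree <- function(id, lookup) {
  branch <- filter(lookup, id_from == id)
  # no children in this lookup table: stop the recursion
  if (nrow(branch) == 0) return(NULL)
  bind_rows(
    branch, 
    map(branch$id_to, \(x) get_tree(x, lookup))
  )
}

# Append the matching lookup subtree for every row flagged with search == 1
complete_trees <- function(data, lookups) {
  branches <- pmap(
    filter(data, search == 1),
    \(id_to, level, ...) {
      bind_rows(map(
          lookups, 
          \(lookup) get_tree(id_to, lookup)
        )) %>%
        # shift the lookup levels by the level of the flagged row
        mutate(level = level + .env$level)
    }
  )
  bind_rows(data, branches) %>%
    select(!search) %>%
    arrange(level, id_from)
}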

map(list, \(x) complete_trees(x, lookups = list_lookup))

Result:

[[1]]
# A tibble: 10 × 3
   id_from id_to level
   <chr>   <chr> <dbl>
 1 <NA>    111       0
 2 111     222       1
 3 222     333       2
 4 333     444       3
 5 444     aaa       4
 6 444     bbb       4
 7 aaa     x         5
 8 bbb     ccc       5
 9 x       x1        6
10 x       x2        6
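
The question mentions that a tree may hold several rows flagged with `search == 1`. As a rough check of that case, the sketch below repeats the two functions from the answer so that it runs on its own and applies them to a small made-up data frame, `df_multi`, whose two flagged rows match different lookup tables. `df_multi` and its ids are assumptions for illustration only.

```r
library(dplyr)
library(purrr)

# Functions as in the answer, repeated so the sketch is self-contained
get_tree <- function(id, lookup) {
  branch <- filter(lookup, id_from == id)
  if (nrow(branch) == 0) return(NULL)
  bind_rows(branch, map(branch$id_to, \(x) get_tree(x, lookup)))
}

complete_trees <- function(data, lookups) {
  branches <- pmap(
    filter(data, search == 1),
    \(id_to, level, ...) {
      bind_rows(map(lookups, \(lookup) get_tree(id_to, lookup))) |>
        mutate(level = level + .env$level)
    }
  )
  bind_rows(data, branches) |>
    select(!search) |>
    arrange(level, id_from)
}

# Lookup tables from the question
df2 <- tibble(id_from = c(NA, "aaa", "x", "x"),
              id_to   = c("aaa", "x", "x1", "x2"),
              level   = c(0, 1, 2, 2))
df3 <- tibble(id_from = c(NA, "thank"),
              id_to   = c("thank", "you"),
              level   = c(0, 1))

# Hypothetical tree with two incomplete items, both at level 1
df_multi <- tibble(id_from = c(NA, "root", "root"),
                   id_to   = c("root", "aaa", "thank"),
                   level   = c(0, 1, 1),
                   search  = c(0, 1, 1))

result <- complete_trees(df_multi, list(df2, df3))
result
```

Each flagged row is resolved against every lookup table, and tables without a match contribute nothing, so `aaa` picks up its subtree from `df2` and `thank` picks up its subtree from `df3`, both shifted by the level of their flagged row.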
