I have a vector of words:
str <- c("The", "Cat", "Jumped")
I want to find all combinations of the words and insert a " + " between the words of each combination, similar to:
paste(str, collapse = " + ")
# [1] "The + Cat + Jumped"
My desired output is:
want <- c("The", "Cat", "Jumped",
"The + Cat", "The + Jumped",
"Cat + Jumped",
"The + Cat + Jumped")
Also note that I only need combinations, so order does not matter: either "The + Cat" or "Cat + The" is fine, but I don't want both.
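I tried a few approaches with combn (here), outer (here), and expand.grid (here), and followed @Akrun's suggestion on a similar question, "r - Get different combinations of words", but to no avail.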
unlist(lapply(seq_along(str), combn, x=str, paste, collapse=' + '))
[1] "The" "Cat" "Jumped" "The + Cat" "The + Jumped"
[6] "Cat + Jumped" "The + Cat + Jumped"
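As a quick sanity check, the result can be compared with the want vector from the question. This is only a sketch: the names got and want are local to this example, and setequal is used because the question says the order of the combinations does not matter.

```r
str <- c("The", "Cat", "Jumped")
want <- c("The", "Cat", "Jumped",
          "The + Cat", "The + Jumped",
          "Cat + Jumped",
          "The + Cat + Jumped")

# combn(str, m, paste, collapse = " + ") pastes every size-m subset;
# lapply runs it for m = 1, 2, ..., length(str)
got <- unlist(lapply(seq_along(str), combn, x = str, paste, collapse = " + "))

# Same set of strings, regardless of order
setequal(got, want)

# A vector of n words has 2^n - 1 non-empty subsets
length(got) == 2^length(str) - 1
```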
You can use combn for every possible size and then collapse the results yourself:
str <- c("The", "Cat", "Jumped")
Map(function(i) combn(str, i, simplify = FALSE), seq_along(str)) |>
unlist(recursive=FALSE) |>
sapply(paste, collapse=" + ")
# [1] "The" "Cat" "Jumped"
# [4] "The + Cat" "The + Jumped" "Cat + Jumped"
# [7] "The + Cat + Jumped"
You can play a trick with intToBits, like this:
lapply(
1:(2^length(str) - 1),
function(k) {
paste0(str[which(intToBits(k) == 1)], collapse = " + ")
}
)
This gives
[[1]]
[1] "The"
[[2]]
[1] "Cat"
[[3]]
[1] "The + Cat"
[[4]]
[1] "Jumped"
[[5]]
[1] "The + Jumped"
[[6]]
[1] "Cat + Jumped"
[[7]]
[1] "The + Cat + Jumped"
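To see why the trick works, it may help to look at a single value of k. In the sketch below, each integer from 1 to 2^n - 1 is treated as a bitmask over the words, with the lowest bit standing for the first word. The variable names (k, bits, mask, res) are only for illustration, and since intToBits works on 32-bit integers this approach presumably only suits fairly short vectors.

```r
str <- c("The", "Cat", "Jumped")

k <- 5L                                  # binary 101
bits <- intToBits(k)[seq_along(str)]     # lowest bit first: 01 00 01
mask <- as.logical(bits)                 # TRUE FALSE TRUE
paste(str[mask], collapse = " + ")       # "The + Jumped", element [[5]] above

# Same idea over every k, returned as a character vector instead of a list
res <- vapply(
  seq_len(2^length(str) - 1),
  function(k) paste(str[as.logical(intToBits(k)[seq_along(str)])], collapse = " + "),
  character(1)
)
res
```

Using vapply rather than lapply is only a design choice here: it yields a plain character vector, which is closer to the want vector in the question.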
Here is a tidyverse variant of @MrFlick's code:
library(tidyverse)
str <- c("The", "Cat", "Jumped")
map(seq_along(str), function(i) {
combn(str, i, simplify = FALSE) %>%
map(~paste(., collapse = " + "))
}) %>%
unlist(recursive = FALSE) %>%
unlist()
[1] "The" "Cat"
[3] "Jumped" "The + Cat"
[5] "The + Jumped" "Cat + Jumped"
[7] "The + Cat + Jumped"
Using the powerSet function from the rje package:
lapply(rje::powerSet(str), paste, collapse = " + ")
#> [[1]]
#> [1] ""
#>
#> [[2]]
#> [1] "The"
#>
#> [[3]]
#> [1] "Cat"
#>
#> [[4]]
#> [1] "The + Cat"
#>
#> [[5]]
#> [1] "Jumped"
#>
#> [[6]]
#> [1] "The + Jumped"
#>
#> [[7]]
#> [1] "Cat + Jumped"
#>
#> [[8]]
#> [1] "The + Cat + Jumped"
Note that the first element corresponds to the empty set. See the wiki entry on Power set for more information.
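Since the question asks only for non-empty combinations, the empty-set entry can be dropped after pasting. The sketch below is an assumption-laden illustration: so that it runs without rje installed, it builds a stand-in list called subsets with base R (empty set first, then the combn subsets), so the order of elements differs from what rje::powerSet returns. With rje available, subsets could presumably be replaced by rje::powerSet(str).

```r
str <- c("The", "Cat", "Jumped")

# Stand-in for rje::powerSet(str): the empty set followed by all non-empty subsets
# (hypothetical substitute; element order is not the same as rje's)
subsets <- c(
  list(character(0)),
  unlist(lapply(seq_along(str), combn, x = str, simplify = FALSE), recursive = FALSE)
)

# Pasting the empty set yields "", so it is easy to filter out afterwards
res <- vapply(subsets, paste, character(1), collapse = " + ")
res <- res[nzchar(res)]
res
```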