Strange behavior of mapply with rep and a dplyr pipe in R


I am working with strings that use two separators, "*" and "|", as in the following string:

"3*4|2*7.4|8*3.2"

The number before "*" is the frequency, and the float or integer after "*" is the value. These frequency*value pairs are separated by "|".

So from "3*4|2*7.4|8*3.2", I want to get the following vector:

"4","4","4","7.4","7.4","3.2","3.2","3.2","3.2","3.2","3.2","3.2","3.2"

I came up with the following code, which runs without errors or warnings, but the final result differs from what I expected:

strsplit("3*4|2*7.4|8*3.2", "[*|]") %>% #Split into a vector with two different separator characters
  unlist %>% #strsplit returns a list, so let's unlist it
         mapply(FUN = rep,
                x = .[seq(from = 2, to = length(.), by = 2)], #these sequences mean even and odd index in this respect
                times = .[seq(from = 1, to = length(.), by = 2)], #rep() flexibly accepts times argument also as string
                USE.NAMES = FALSE) %>%
         unlist #mapply returns a list, so let's unlist it

[1] "4"   "4"   "4"   "7.4" "7.4" "7.4" "7.4" "3.2" "3.2" "4"   "4"   "4"   "4"   "4"   "4"   "4"   "7.4" "7.4" "7.4" "7.4" "7.4" "7.4" "7.4" "7.4" "3.2" "3.2" "3.2"

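As you can see, something strange happens: "4" is repeated three times, which is correct, but "7.4" is repeated four times (wrong), and so on.

What is going on here?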

r dplyr mapply rep
3 answers
1 vote

You can do this in two steps using lapply:

"3*4|2*7.4|8*3.2" %>% strsplit("[|]") %>%
                      unlist %>%
                      strsplit("[*]") %>%
                      lapply(function(x) rep(x[2],x[1])) %>%
                      unlist

# [1] "4"   "4"   "4"   "7.4" "7.4" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2"
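
As for why the original pipeline misbehaves, a likely explanation is the way %>% treats the dot: when the dot appears only inside nested calls, as in .[seq(...)], the left-hand side is still inserted as the first argument. mapply then receives the full six-element vector as an extra unnamed argument, recycles x and times against it, and hands each element to rep as an additional positional argument. The group sizes in the question's output (3, 4, 2, 7, 8, 3) are consistent with that reading. The sketch below uses base R only and writes out the assumed expansion by hand; the names parts, freqs, values, wrong and fixed are illustrative and not from the question.

```r
# Base R only: spell out what the piped call is assumed to expand to, then fix it.
s <- "3*4|2*7.4|8*3.2"
parts  <- unlist(strsplit(s, "[*|]"))            # "3" "4" "2" "7.4" "8" "3.2"
freqs  <- parts[seq(1, length(parts), by = 2)]   # odd positions:  "3" "2" "8"
values <- parts[seq(2, length(parts), by = 2)]   # even positions: "4" "7.4" "3.2"

# Assumed expansion of the piped call: `parts` arrives as an extra unnamed
# argument, so mapply iterates over six elements instead of three.
wrong <- unlist(mapply(parts, FUN = rep, x = values, times = freqs,
                       USE.NAMES = FALSE))

# Same call without the extra argument, with the counts converted explicitly.
fixed <- unlist(mapply(FUN = rep, x = values, times = as.integer(freqs),
                       USE.NAMES = FALSE))
print(fixed)
```

Inside the pipe, wrapping the mapply call in braces, i.e. %>% { mapply(FUN = rep, x = ..., times = ..., USE.NAMES = FALSE) }, should have the same effect, because magrittr does not insert the left-hand side as a first argument into a braced expression.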

0 votes

You can replace each | with a newline, read the data into a data frame, and pass that data frame to rep() via do.call:

do.call(rep,
        read.delim(text = gsub("\\|", "\n", "3*4|2*7.4|8*3.2"),
                   sep = "*",
                   header = FALSE,
                   col.names = c("times", "x"))
        )

[1] 4.0 4.0 4.0 7.4 7.4 3.2 3.2 3.2 3.2 3.2 3.2 3.2 3.2
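
To make the do.call step less opaque, here is a small sketch of what it amounts to: a data frame is a named list of columns, so do.call(rep, df) supplies the columns times and x to rep as named arguments. Note that read.delim parses the values as numbers, so the result is numeric rather than character; the final conversion is an optional extra, and the names df, explicit and as_text are illustrative.

```r
# The data frame built by read.delim() is a named list of columns.
df <- read.delim(text = gsub("\\|", "\n", "3*4|2*7.4|8*3.2"),
                 sep = "*", header = FALSE, col.names = c("times", "x"))

# do.call(rep, df) passes those columns as named arguments, i.e. this call:
explicit <- rep(x = df$x, times = df$times)

# Optional: convert back to strings to match the vector asked for in the question.
as_text <- as.character(explicit)
print(as_text)
```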

0 votes

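1) The one-liner below matches the two numbers and passes them as separate arguments to an anonymous function specified in formula notation, returning that function's output. The input x is taken from the question and is defined explicitly in the Note at the end.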

library(gsubfn)

strapply(x, "([0-9]+)\\*([0-9.]+)", n + x ~ rep(x, as.numeric(n)))[[1]]
## [1] "4" "4" "4" "7.4" "7.4" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2"

If we have a character vector of strings like x, this also works once the [[1]] is dropped. In that case it returns a list of results.

xx <- c(x, x)
strapply(xx, "([0-9]+)\\*([0-9.]+)", n + x ~ rep(x, as.numeric(n)))
## [[1]]
## [1] "4" "4" "4" "7.4" "7.4" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2"
##
## [[2]]
## [1] "4" "4" "4" "7.4" "7.4" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2"

2) Another approach is to extract the repetition counts and the values separately and pass each of these vectors to rep.

library(gsubfn)

rep(strapplyc(x, "\\*([0-9.]+)")[[1]],
    strapply(x, "(\\d+)\\*", as.numeric)[[1]])
## [1] "4" "4" "4" "7.4" "7.4" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2" "3.2"
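
If installing gsubfn is not an option, the same idea can be sketched in base R with regmatches and gregexpr. This is not part of the answer above, only an illustration; it assumes that counts are non-negative integers and that values consist of digits and dots. The names counts, values and result are illustrative.

```r
x <- "3*4|2*7.4|8*3.2"

# Counts: runs of digits immediately followed by "*" (lookahead needs perl = TRUE).
counts <- regmatches(x, gregexpr("[0-9]+(?=\\*)", x, perl = TRUE))[[1]]

# Values: digits and dots immediately preceded by "*" (lookbehind).
values <- regmatches(x, gregexpr("(?<=\\*)[0-9.]+", x, perl = TRUE))[[1]]

# rep() is vectorised over times, so no explicit loop is needed.
result <- rep(values, times = as.integer(counts))
print(result)
```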

Note

The input used is:

x <- "3*4|2*7.4|8*3.2"