Vectorized functions and logical operators in dplyr::mutate


I am trying to vectorize a function for use in dplyr::mutate. For the life of me, I cannot get it to work. This is what I have been doing:

library(dplyr)  # provides %>%, mutate() and as_tibble() used below

str_to_seq <- Vectorize(function(x) {
  
  # This function converts text format year ranges (e.g. "1970 - 1979") to 
  # numeric ranges. Handily works with single values and edge cases such as 
  # "- 1920".
  
  res <- stringr::str_extract_all(x, "\\d+") %>% 
    unlist() %>% 
    {seq(dplyr::first(.), dplyr::last(.))}
  
  return(res)
  
}, vectorize.args = "x", SIMPLIFY = F)

year <- c(1970, 1980, 1990, 2000, 2010, 2020)
agegroup <- "1950 - 1959"

testt <- expand.grid(agegroup = agegroup, year = year, stringsAsFactors = F)

testt %>% 
  as_tibble() %>% 
  dplyr::mutate(
    yearminus50 = year - 50,
    statement = all(yearminus50 >= str_to_seq(agegroup)))

The statement column fails with this error message:

Error in `dplyr::mutate()`:
ℹ In argument: `statement = all(yearminus50 >= str_to_seq(agegroup))`.
Caused by error:
! 'list' object cannot be coerced to type 'double'
Run `rlang::last_trace()` to see where the error occurred.

I cannot get my function str_to_seq to return a plain vector. The output appears to be a list.

statement should be c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE), as this brute-force code shows:

all(year[1] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[2] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[3] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[4] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[5] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[6] - 50 >= unlist(str_to_seq(agegroup)[[1]]))

How can I improve my code so that the line

statement = all(yearminus50 >= str_to_seq(agegroup))

works as intended?

Many thanks.

r dplyr vectorization mutate
1 Answer

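The problem is not your function but the expectation that all(..) will work with a list-column. all() collapses its whole argument into a single value, and the comparison yearminus50 >= str_to_seq(agegroup) cannot be evaluated when the right-hand side is a list of multi-element vectors. We need to sapply (or similar) over the return value of str_to_seq, so that each row is compared with its own sequence.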
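A minimal sketch of that row-wise idea, using mapply as the "or similar" variant of sapply because two inputs are walked in parallel. The str_to_seq below is a base-R stand-in written for this example (an assumption, not the asker's exact function); like the original, it returns a list with one numeric sequence per input string.

```r
library(dplyr)

# Stand-in for the question's helper (assumption: same contract, base R only).
# Returns a list holding one integer sequence per element of x.
str_to_seq <- function(x) {
  lapply(regmatches(x, gregexpr("[0-9]+", x)), function(d) {
    d <- as.integer(d)
    seq(d[1], d[length(d)])
  })
}

testt <- expand.grid(
  agegroup = "1950 - 1959",
  year = c(1970, 1980, 1990, 2000, 2010, 2020),
  stringsAsFactors = FALSE
)

res <- testt %>%
  mutate(
    yearminus50 = year - 50,
    # Pair each row's scalar with that row's own sequence, then reduce with all().
    statement = mapply(function(y, s) all(y >= s),
                       yearminus50, str_to_seq(agegroup))
  )

res$statement
# FALSE FALSE FALSE FALSE  TRUE  TRUE
```

The key point is that all() now runs once per row instead of once for the whole column.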

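However, if this is "all" you need, we can skip building the sequence: yearminus50 is greater than or equal to every value in the range exactly when it is greater than or equal to the largest one. So we can extract the maximum from agegroup and compare against that: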

testt |>
  mutate(
    yearminus50 = year - 50,
    statement = yearminus50 >=
      sapply(strsplit(agegroup, "[- ]+"), function(z) max(as.integer(z)))
  )
#      agegroup year yearminus50 statement
# 1 1950 - 1959 1970        1920     FALSE
# 2 1950 - 1959 1980        1930     FALSE
# 3 1950 - 1959 1990        1940     FALSE
# 4 1950 - 1959 2000        1950     FALSE
# 5 1950 - 1959 2010        1960      TRUE
# 6 1950 - 1959 2020        1970      TRUE
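One caveat, as far as I can tell: for the edge case "- 1920" mentioned in the question, strsplit with "[- ]+" yields a leading empty string, as.integer("") is NA, and max() then returns NA. A hedged sketch that extracts the digit runs instead is shown here. max_year is a hypothetical helper name introduced for this example, and it assumes every label contains at least one number.

```r
# Hypothetical helper (not part of the original answer): largest year per label.
# Assumes each element of x contains at least one run of digits.
max_year <- function(x) {
  digits <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(digits, function(z) max(as.integer(z)), integer(1))
}

max_year(c("1950 - 1959", "- 1920", "1985"))
# 1959 1920 1985
```

Inside mutate this would be used as statement = yearminus50 >= max_year(agegroup).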