Vectorized functions and logical operators in dplyr::mutate

Question

I'm trying to vectorize a function for use in dplyr::mutate. For the life of me, I can't get it to work. Here's what I've been doing:

library(dplyr)  # provides %>%, as_tibble() and mutate()

str_to_seq <- Vectorize(function(x) {

  # This function converts text format year ranges (e.g. "1970 - 1979") to
  # numeric ranges. Handily works with single values and edge cases such as
  # "- 1920".

  res <- stringr::str_extract_all(x, "\\d+") %>%
    unlist() %>%
    {seq(dplyr::first(.), dplyr::last(.))}

  return(res)

}, vectorize.args = "x", SIMPLIFY = FALSE)

year <- c(1970, 1980, 1990, 2000, 2010, 2020)
agegroup <- "1950 - 1959"

testt <- expand.grid(agegroup = agegroup, year = year, stringsAsFactors = FALSE)

testt %>% 
  as_tibble() %>% 
  dplyr::mutate(
    yearminus50 = year - 50,
    statement = all(yearminus50 >= str_to_seq(agegroup)))

The statement column fails with the error message:

Error in `dplyr::mutate()`:
ℹ In argument: `statement = all(yearminus50 >= str_to_seq(agegroup))`.
Caused by error:
! 'list' object cannot be coerced to type 'double'
Run `rlang::last_trace()` to see where the error occurred.

I can't get my function str_to_seq to produce a plain vector. The output seems to be a list.

statement should be c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE), as we can see from this brute-force code:

all(year[1] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[2] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[3] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[4] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[5] - 50 >= unlist(str_to_seq(agegroup)[[1]]))
all(year[6] - 50 >= unlist(str_to_seq(agegroup)[[1]]))

How can I improve my code so that the statement = all(yearminus50 >= str_to_seq(agegroup)) line works?

Thanks a lot.

Answer 1

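The problem is not your function, but the expectation that all(..) will work with a list-column. str_to_seq returns a list with one sequence per row, and all(..) collapses its whole argument to a single value rather than working row by row. We need to sapply (or similar) over the return from str_to_seq.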
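As a minimal sketch of that row-by-row idea, the code below pairs each yearminus50 value with its own sequence using mapply and calls all() once per pair. The helper range_to_seq is a hypothetical base-R stand-in for str_to_seq (the name and the regmatches approach are assumptions, not from the question), used only to keep the example self-contained.

```r
library(dplyr)

# Hypothetical stand-in for str_to_seq: one integer sequence per input
# string, always returned as a list (the name is an assumption).
range_to_seq <- function(x) {
  lapply(regmatches(x, gregexpr("[0-9]+", x)), function(d) {
    d <- as.integer(d)
    seq(d[1], d[length(d)])
  })
}

testt <- expand.grid(
  agegroup = "1950 - 1959",
  year = c(1970, 1980, 1990, 2000, 2010, 2020),
  stringsAsFactors = FALSE
)

res <- testt |>
  mutate(
    yearminus50 = year - 50,
    # mapply() walks the numeric column and the list in parallel, so
    # all() sees one scalar and one sequence at a time.
    statement = mapply(
      function(y, s) all(y >= s),
      yearminus50,
      range_to_seq(agegroup)
    )
  )

res$statement
```

The same pattern should work with the original str_to_seq in place of range_to_seq, since both return a list with one element per row.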

However, if this is "all" you need, we can extract the max from agegroup and compare:

testt |>
  mutate(
    yearminus50 = year - 50,
    statement = yearminus50 >=
      sapply(strsplit(agegroup, "[- ]+"), function(z) max(as.integer(z)))
  )
#      agegroup year yearminus50 statement
# 1 1950 - 1959 1970        1920     FALSE
# 2 1950 - 1959 1980        1930     FALSE
# 3 1950 - 1959 1990        1940     FALSE
# 4 1950 - 1959 2000        1950     FALSE
# 5 1950 - 1959 2010        1960      TRUE
# 6 1950 - 1959 2020        1970      TRUE
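The question also mentions edge cases such as "- 1920". With the strsplit pattern above, a leading separator appears to leave an empty string that as.integer() turns into NA, so max() would return NA for that row. A hedged variant, assuming only the digit runs matter, extracts the numbers directly. The helper name range_max is hypothetical.

```r
# Hypothetical helper (not from the original answer): upper bound of each
# range, taken from the digit runs only so stray separators are ignored.
range_max <- function(x) {
  vapply(
    regmatches(x, gregexpr("[0-9]+", x)),
    function(d) max(as.integer(d)),
    integer(1)
  )
}

agegroup <- c("1950 - 1959", "- 1920", "1975")
upper <- range_max(agegroup)

# Plain vectorized comparison, no list-column involved.
yearminus50 <- c(1960, 1960, 1960)
statement <- yearminus50 >= upper
```

Inside mutate this would read statement = yearminus50 >= range_max(agegroup). Comparing against the maximum is equivalent to all() over the full sequence as long as each range runs from low to high.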
