How do I use dplyr to select columns that share a common but unknown string element?

Question

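Suppose I have some data with count and percentage columns produced by earlier operations, where the column names depend on the data. My question is: how can I select the columns that contain a common string when, because it is data-dependent, I don't know that string in advance? Here is some toy data where we do know the column names (but let's pretend we don't!):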

library(tidyverse)

mtcars <- mtcars %>% 
    mutate(cnt_something = sample(0:100, nrow(mtcars)),
           cnt_otherthing = sample(0:100, nrow(mtcars)),
           pct_something = paste0( cnt_something, "%"),
           pct_otherthing = paste0( cnt_otherthing, "%"))

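So in the real data, the strings something and otherthing come from earlier data-dependent steps, and there may in fact be many such columns, but I know there will always be a pair of columns of the form cnt_***** and pct_*****. My question is therefore: without knowing the different possible ***** strings, how can I select the cnt_***** and pct_***** columns whose ***** matches, so that I can apply the next operation (e.g. paste0()) to each pair?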

The desired output looks like this:

                   result_something result_otherthing
Mazda RX4                   49 (49%)          82 (82%)
Mazda RX4 Wag               20 (20%)          72 (72%)
Datsun 710                  37 (37%)          75 (75%)
Hornet 4 Drive              22 (22%)          85 (85%)
Hornet Sportabout           53 (53%)        100 (100%)

Answer 1

Here is one approach, which uses rlang to parse expressions built as strings:

library(tidyverse)
library(rlang)

# select() with no arguments drops all existing columns,
# so only the toy cnt_/pct_ columns remain
mtcars <- mtcars |> select() |>
  mutate(cnt_something = sample(0:100, nrow(mtcars)),
         cnt_otherthing = sample(0:100, nrow(mtcars)),
         pct_something = paste0(cnt_something, "%"),
         pct_otherthing = paste0(cnt_otherthing, "%"))


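funky_funks <- function(data,
                        prefix_1,
                        prefix_2,
                        prefix_res) {
  # Use the names of the data passed in, not the global mtcars
  nm1 <- names(data)
  # Columns starting with the first prefix, e.g. "cnt_something"
  nm2 <- nm1[startsWith(nm1, prefix_1)]
  # Names for the new columns, e.g. "result_something"
  res_names <- str_replace(nm2, fixed(prefix_1), prefix_res)

  # Build one expression per pair,
  # e.g. paste0(cnt_something, " (", pct_something, ")")
  plist <- map(nm2, \(n1) {
    n2 <- str_replace(n1, fixed(prefix_1), prefix_2)
    myexpr <- paste0('paste0(', n1, ', " (", ', n2, ', ")")')
    rlang::parse_expr(myexpr)
  }) |> set_names(res_names)

  # Splice the named expressions into mutate()
  data |> mutate(!!!plist)
}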


funky_funks(mtcars,
            prefix_1 = "cnt_",
            prefix_2 = "pct_",
            prefix_res = "result_")
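
As a possible alternative, the same pairing can be sketched without building code as strings, which may be more robust if a column name is not a syntactic R name. This is only a sketch: combine_pairs() and the toy data frame are hypothetical names introduced here, and it assumes every column with the first prefix has a partner with the second prefix.

```r
library(dplyr)
library(purrr)

# Hypothetical helper (not from the answer above): pairs each <prefix_1>X column
# with its <prefix_2>X partner by indexing the data directly.
combine_pairs <- function(data, prefix_1, prefix_2, prefix_res) {
  # Columns that start with the first prefix, e.g. "cnt_something"
  cols_1 <- names(data)[startsWith(names(data), prefix_1)]
  # The data-dependent part of each name, e.g. "something"
  suffixes <- substring(cols_1, nchar(prefix_1) + 1)

  # Build one character vector per suffix: "<cnt> (<pct>)"
  results <- map(suffixes, \(s) {
    paste0(data[[paste0(prefix_1, s)]], " (", data[[paste0(prefix_2, s)]], ")")
  }) |>
    set_names(paste0(prefix_res, suffixes))

  # Splice the named list in as new columns
  mutate(data, !!!results)
}

# Small toy data with fixed values so the result is easy to check
toy <- data.frame(
  cnt_something  = c(49, 20),
  pct_something  = c("49%", "20%"),
  cnt_otherthing = c(82, 72),
  pct_otherthing = c("82%", "72%")
)

out <- combine_pairs(toy, "cnt_", "pct_", "result_")
out$result_something
# "49 (49%)" "20 (20%)"
```

The suffixes are recovered from the column names at run time, so nothing about something or otherthing has to be known in advance.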
