Pivoting longer two sets of columns that share the same set of suffixes, mostly a question of getting the regex right

Question (votes: 0, answers: 1)

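I have a data frame that contains the following columns:

borg_dx, borg_sx, borg_dominant, borg_nondominant

These hold integer values of the Borg scale for the right, left, dominant, and non-dominant sides.

Then I have another set of columns, called:

borg_category_dx, borg_category_sx, borg_category_dominant, borg_category_nondominant,

which hold character values for the Borg category (no effort, some effort, etc.).

I want to pivot the data frame to long format, with one column for the side, one for the borg value, and one for the borg category, like this: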

df1 <- data.frame(
  side = c("dx", "sx", "dom", "nondom", "dx", "sx", "dom", "nondom"),
  borg_value = c(0, 1, 3, 1, 6, 1, 1, 4),
  borg_category = c("no effort", "some effort", "no effort", "no effort",
                    "max effort", "no effort", "some effort", "some effort")
)

If I only wanted to pivot the borg values or the categories, I could do this:

pivot_longer(
      cols = starts_with("borg"),
      names_to = "side",
      names_prefix = "borg_",
      values_to = "borg_value")

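But to send all the suffixes (dx, sx, dominant, nondominant) to a single column called side and the values to two columns, I gathered from other answers that I should use something like: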

 pivot_longer(
  cols = starts_with("borg"),
  names_to = c(".value", "side"),
  names_prefix = "borg",
  names_pattern = ?????? 

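Except that I don't know how to write the regex part. I also don't know whether I should rename the columns so that they use only one underscore and a different separator.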

r regex pivot
1 Answer

From your description, it looks like your data is as follows.

dput() output for reproduction:

df0 <- structure(list(borg_dx = c( 0L, 2L, 2L, 7L, 5L, 6L, 0L, 2L, 10L, 7L, 5L, 7L, 3L, 4L, 3L, 9L, 5L, 1L, 4L, 5L, 9L, 10L, 3L, 6L, 0L, 3L, 7L, 5L, 0L, 10L, 7L, 2L, 4L, 8L, 0L, 8L, 3L, 2L, 5L, 7L, 5L, 2L, 1L, 4L, 5L, 6L, 9L, 1L, 9L, 10L ), 
                       borg_sx = c( 9L, 8L, 0L, 0L, 4L, 9L, 10L, 2L, 3L, 4L, 10L, 10L, 10L, 0L, 3L, 6L, 2L, 10L, 4L, 8L, 9L, 6L, 1L, 6L, 9L, 3L, 5L, 1L, 5L, 5L, 7L, 3L, 2L, 6L, 4L, 0L, 2L, 0L, 0L, 4L, 3L, 0L, 7L, 10L, 5L, 7L, 9L, 0L, 0L, 9L ), 
                       borg_dominant = c( 0L, 10L, 10L, 5L, 1L, 6L, 0L, 8L, 9L, 4L, 7L, 0L, 1L, 6L, 3L, 10L, 5L, 0L, 4L, 4L, 6L, 2L, 3L, 9L, 4L, 1L, 5L, 10L, 9L, 8L, 4L, 3L, 10L, 7L, 6L, 4L, 10L, 1L, 7L, 8L, 2L, 0L, 8L, 9L, 1L, 8L, 9L, 9L, 6L, 5L ), 
                       borg_nondominant = c( 8L, 3L, 6L, 4L, 1L, 6L, 4L, 3L, 1L, 6L, 4L, 9L, 6L, 1L, 2L, 8L, 8L, 6L, 1L, 0L, 5L, 5L, 0L, 8L, 6L, 8L, 10L, 8L, 4L, 2L, 6L, 1L, 10L, 7L, 2L, 4L, 5L, 7L, 5L, 8L, 5L, 7L, 5L, 8L, 6L, 7L, 2L, 10L, 1L, 3L ), 
                       borg_category_dx = c( "Rest", "Really Easy", "Really Easy", "Really Hard", "Hard", "Really Hard", "Rest", "Really Easy", "Maximal: Just like my hardest race", "Really Hard", "Hard", "Really Hard", "Moderate", "Sort of Hard", "Moderate", "Really, Really, Hard", "Hard", "Rest", "Sort of Hard", "Hard", "Really, Really, Hard", "Maximal: Just like my hardest race", "Moderate", "Really Hard", "Rest", "Moderate", "Really Hard", "Hard", "Rest", "Maximal: Just like my hardest race", "Really Hard", "Really Easy", "Sort of Hard", "Really, Really, Hard", "Rest", "Really, Really, Hard", "Moderate", "Really Easy", "Hard", "Really Hard", "Hard", "Really Easy", "Rest", "Sort of Hard", "Hard", "Really Hard", "Really, Really, Hard", "Rest", "Really, Really, Hard", "Maximal: Just like my hardest race" ), 
                       borg_category_sx = c( "Really, Really, Hard", "Really, Really, Hard", "Rest", "Rest", "Sort of Hard", "Really, Really, Hard", "Maximal: Just like my hardest race", "Really Easy", "Moderate", "Sort of Hard", "Maximal: Just like my hardest race", "Maximal: Just like my hardest race", "Maximal: Just like my hardest race", "Rest", "Moderate", "Really Hard", "Really Easy", "Maximal: Just like my hardest race", "Sort of Hard", "Really, Really, Hard", "Really, Really, Hard", "Really Hard", "Rest", "Really Hard", "Really, Really, Hard", "Moderate", "Hard", "Rest", "Hard", "Hard", "Really Hard", "Moderate", "Really Easy", "Really Hard", "Sort of Hard", "Rest", "Really Easy", "Rest", "Rest", "Sort of Hard", "Moderate", "Rest", "Really Hard", "Maximal: Just like my hardest race", "Hard", "Really Hard", "Really, Really, Hard", "Rest", "Rest", "Really, Really, Hard" ), 
                       borg_category_dominant = c( "Rest", "Maximal: Just like my hardest race", "Maximal: Just like my hardest race", "Hard", "Rest", "Really Hard", "Rest", "Really, Really, Hard", "Really, Really, Hard", "Sort of Hard", "Really Hard", "Rest", "Rest", "Really Hard", "Moderate", "Maximal: Just like my hardest race", "Hard", "Rest", "Sort of Hard", "Sort of Hard", "Really Hard", "Really Easy", "Moderate", "Really, Really, Hard", "Sort of Hard", "Rest", "Hard", "Maximal: Just like my hardest race", "Really, Really, Hard", "Really, Really, Hard", "Sort of Hard", "Moderate", "Maximal: Just like my hardest race", "Really Hard", "Really Hard", "Sort of Hard", "Maximal: Just like my hardest race", "Rest", "Really Hard", "Really, Really, Hard", "Really Easy", "Rest", "Really, Really, Hard", "Really, Really, Hard", "Rest", "Really, Really, Hard", "Really, Really, Hard", "Really, Really, Hard", "Really Hard", "Hard" ), 
                       borg_category_nondominant = c( "Really, Really, Hard", "Moderate", "Really Hard", "Sort of Hard", "Rest", "Really Hard", "Sort of Hard", "Moderate", "Rest", "Really Hard", "Sort of Hard", "Really, Really, Hard", "Really Hard", "Rest", "Really Easy", "Really, Really, Hard", "Really, Really, Hard", "Really Hard", "Rest", "Rest", "Hard", "Hard", "Rest", "Really, Really, Hard", "Really Hard", "Really, Really, Hard", "Maximal: Just like my hardest race", "Really, Really, Hard", "Sort of Hard", "Really Easy", "Really Hard", "Rest", "Maximal: Just like my hardest race", "Really Hard", "Really Easy", "Sort of Hard", "Hard", "Really Hard", "Hard", "Really, Really, Hard", "Hard", "Really Hard", "Hard", "Really, Really, Hard", "Really Hard", "Really Hard", "Really Easy", "Maximal: Just like my hardest race", "Rest", "Moderate" ) ), 
                  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,-50L))

As a tibble:

dplyr::as_tibble(df0)
#> # A tibble: 50 × 8
#>    borg_dx borg_sx borg_dominant borg_nondominant borg_category_dx              
#>      <int>   <int>         <int>            <int> <chr>                         
#>  1       0       9             0                8 Rest                          
#>  2       2       8            10                3 Really Easy                   
#>  3       2       0            10                6 Really Easy                   
#>  4       7       0             5                4 Really Hard                   
#>  5       5       4             1                1 Hard                          
#>  6       6       9             6                6 Really Hard                   
#>  7       0      10             0                4 Rest                          
#>  8       2       2             8                3 Really Easy                   
#>  9      10       3             9                1 Maximal: Just like my hardest…
#> 10       7       4             4                6 Really Hard                   
#> # ℹ 40 more rows
#> # ℹ 3 more variables: borg_category_sx <chr>, borg_category_dominant <chr>,
#> #   borg_category_nondominant <chr>

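The names_prefix argument of pivot_longer() strips the prefix before names_pattern is applied. In this case the prefix is what tells the borg values and the borg categories apart, so it is left out. In names_pattern we use parentheses to define two capture groups; the first "(.*)" is greedy, so it captures everything up to the last underscore, leaving everything after that underscore to the second group.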
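To see how the pattern splits each column name, here is a minimal base R sketch. It uses sub() as a stand-in for the matching that pivot_longer() performs; the variable names (cols, value_part, side_part, stripped) are only illustrative, and the last step assumes a prefix of "borg_" as in the question's first snippet.

```r
# Column names taken from the question.
cols <- c("borg_dx", "borg_sx", "borg_dominant", "borg_nondominant",
          "borg_category_dx", "borg_category_sx",
          "borg_category_dominant", "borg_category_nondominant")

pattern <- "(.*)_(.*)"

# Group 1 is greedy, so it extends to the last underscore in the name.
value_part <- sub(pattern, "\\1", cols)  # becomes the .value column name
side_part  <- sub(pattern, "\\2", cols)  # becomes the entry in `side`

print(data.frame(column = cols, value_part = value_part, side_part = side_part))

# Stripping the prefix first (what names_prefix = "borg_" would do)
# leaves the value columns with nothing for group 1 to capture.
stripped <- sub("^borg_", "", cols)
print(stripped)
```

The first four names split into "borg" plus the side, and the last four into "borg_category" plus the side, which is why the result ends up with exactly two value columns.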

library(tidyr)

df0 |>
  pivot_longer(
    cols = starts_with("borg"),
    names_to = c(".value", "side"),
    names_pattern = "(.*)_(.*)"
  )
#> # A tibble: 200 × 3
#>    side         borg borg_category                     
#>    <chr>       <int> <chr>                             
#>  1 dx              0 Rest                              
#>  2 sx              9 Really, Really, Hard              
#>  3 dominant        0 Rest                              
#>  4 nondominant     8 Really, Really, Hard              
#>  5 dx              2 Really Easy                       
#>  6 sx              8 Really, Really, Hard              
#>  7 dominant       10 Maximal: Just like my hardest race
#>  8 nondominant     3 Moderate                          
#>  9 dx              2 Really Easy                       
#> 10 sx              0 Rest                              
#> # ℹ 190 more rows
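
If the value column should be called borg_value, as in the layout sketched in the question, one option is to rename it after pivoting rather than renaming the input columns. The following is a small sketch under that assumption; toy and long are illustrative names, and toy holds only the first two rows of the data above.

```r
library(tidyr)

# Two-row stand-in for df0 (first two rows of the data above).
toy <- data.frame(
  borg_dx = c(0L, 2L),
  borg_sx = c(9L, 8L),
  borg_dominant = c(0L, 10L),
  borg_nondominant = c(8L, 3L),
  borg_category_dx = c("Rest", "Really Easy"),
  borg_category_sx = c("Really, Really, Hard", "Really, Really, Hard"),
  borg_category_dominant = c("Rest", "Maximal: Just like my hardest race"),
  borg_category_nondominant = c("Really, Really, Hard", "Moderate")
)

long <- toy |>
  pivot_longer(
    cols = starts_with("borg"),
    names_to = c(".value", "side"),
    names_pattern = "(.*)_(.*)"
  )

# Rename the value column after the pivot; the input columns stay untouched.
names(long)[names(long) == "borg"] <- "borg_value"

print(long)
```

Each input row contributes one output row per side, so the two-row toy data yields eight rows.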