pivot_longer: names_to and names_pattern


I have a wide data frame.

Year <- c(2020, 2021)
Percent_a <- c(10,10)
Percent_b <- c(12,10)
Percent_c <- c(2,4)
Percent_d <- c(4,5)

df <- data.frame(Year, Percent_a, Percent_b, Percent_c, Percent_d)

I would like my data in the following format:

Year  Item  Percent
2020  a     10
2020  b     12
2020  c     2
2020  d     4
2021  a     10
2021  b     10
2021  c     4
2021  d     5

I tried this:

df %>%
  pivot_longer(
    cols = -Year,
    names_to = c(".value", "Percent"),
    names_pattern = "(.)_(.*)",
    values_to = "Percentage"
  ) ->df_longer

It almost worked, but I got something like the following. What is "t"?

Year  Percent  t
2020  a        10
2020  b        12
2020  c        2
2020  d        4
2021  a        10
2021  b        10
2021  c        4
2021  d        5
1 Answer

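The "t" is the last character of "Percent". In your pattern, the first group (.) matches exactly one character, the one immediately before the underscore, and because that group is mapped to ".value", the value column ends up named "t". The first group needs to capture the whole word "Percent", not just its final character. Note also that values_to is ignored when names_to contains ".value", so "Percentage" never appears in the result.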
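To see what each pattern captures, here is a small base R sketch. The helper name capture_groups is hypothetical and only for illustration, and base R's regexec is used as a stand-in; for patterns this simple it should agree with what pivot_longer extracts.

```r
# Column names from the question's data frame
cols <- c("Percent_a", "Percent_b", "Percent_c", "Percent_d")

# Hypothetical helper: return the capture groups of `pattern` for each name
capture_groups <- function(pattern, x) {
  m <- regmatches(x, regexec(pattern, x))
  # Drop the first element (the full match) and keep only the groups
  t(sapply(m, function(hit) hit[-1]))
}

original <- capture_groups("(.)_(.*)", cols)  # pattern from the question
fixed    <- capture_groups("(.*)_(.)", cols)  # pattern with the groups swapped

print(original)  # first group is "t", second is the item letter
print(fixed)     # first group is "Percent", second is the item letter
```

The first column of each result is what ".value" receives, which is why the question's pattern produces a column named "t".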

Try:

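library(tidyr)  # provides pivot_longer() and re-exports %>%

df %>%
  pivot_longer(
    cols = -Year,
    # first group -> name of the value column, second group -> Item
    names_to = c(".value", "Item"),
    names_pattern = "(.*)_(.)"
  )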

Output:

# A tibble: 8 × 3
   Year Item  Percent
  <dbl> <chr>   <dbl>
1  2020 a          10
2  2020 b          12
3  2020 c           2
4  2020 d           4
5  2021 a          10
6  2021 b          10
7  2021 c           4
8  2021 d           5
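Because every column name here contains exactly one underscore, splitting on a separator may be a simpler alternative to a regular expression. The following is a sketch using the names_sep argument of pivot_longer. It is not part of the original answer and assumes the column names keep the "Percent_<letter>" form.

```r
library(tidyr)

# Data frame from the question
df <- data.frame(
  Year = c(2020, 2021),
  Percent_a = c(10, 10),
  Percent_b = c(12, 10),
  Percent_c = c(2, 4),
  Percent_d = c(4, 5)
)

# Split each column name at "_": the part before it names the value
# column (".value"), the part after it goes into Item
df_longer <- pivot_longer(
  df,
  cols = -Year,
  names_to = c(".value", "Item"),
  names_sep = "_"
)

print(df_longer)
```

If some names contained more than one underscore, names_pattern would likely be the safer choice, since it states explicitly which part belongs to which group.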