Reshaping a data frame in R to separate strings


I have this data frame in R:

library(dplyr)
library(tidyr)


df <- data.frame(
  ID = 1,
  `Zebra fish (one)` = 3,
  `Zebra fish (two)` = 4,
  `Dog-caut (zero)` = 9,
  `Dog-caut (hello there)` = 12
)

I tried to reshape it with the code below, but as you can see, the TYPE column always comes out empty. How can I fix it?

# Reshaping the dataframe
long_df <- df %>%
  pivot_longer(
    cols = -ID, 
    names_to = "CATEGORY_TYPE", 
    values_to = "SCORE"
  ) %>%
  separate(CATEGORY_TYPE, into = c("CATEGORY", "TYPE"), sep = " \\(") %>%
  mutate(TYPE = sub("\\)", "", TYPE))

The data frame should look like this:

ID, CATEGORY, TYPE, SCORE
1, Zebra fish, one, 3
1, Zebra fish, two, 4
1, Dog-caut, zero, 9
1, Dog-caut, hello there, 12
r dplyr tidyr grepl
1 Answer

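This is probably a good case for tidyr::separate_wider_delim. Because data.frame() defaults to check.names = TRUE, the column names have already been converted to syntactic names such as Zebra.fish..one., so there is no " (" left to split on. The delimiter to use is "..":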

df %>%
  pivot_longer(
    cols = -ID, 
    names_to = "CATEGORY_TYPE", 
    values_to = "SCORE"
  ) %>%
  separate_wider_delim(CATEGORY_TYPE, delim = "..", names = c("CATEGORY", "TYPE"))

#     ID CATEGORY   TYPE         SCORE
#   <dbl> <chr>      <chr>        <dbl>
# 1     1 Zebra.fish one.             3
# 2     1 Zebra.fish two.             4
# 3     1 Dog.caut   zero.            9
# 4     1 Dog.caut   hello.there.    12
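
To see where the ".." delimiter comes from, it may help to inspect the column names directly. The following is a minimal sketch using a shortened version of the question's data frame (two of the four columns, chosen only for brevity); it relies on base R alone.

```r
# Shortened version of the question's data frame, built with the
# default check.names = TRUE
df <- data.frame(
  ID = 1,
  `Zebra fish (one)` = 3,
  `Dog-caut (hello there)` = 12
)

# data.frame() runs the names through make.names(), which replaces every
# character that is not allowed in a syntactic name (spaces, parentheses,
# hyphens) with "."
names(df)
# [1] "ID"                     "Zebra.fish..one."       "Dog.caut..hello.there."

make.names("Zebra fish (one)")
# [1] "Zebra.fish..one."
```

The space and the opening parenthesis each become a period, which is why the category and the type end up separated by two periods.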

If you want to clean up the result by replacing the periods with spaces, add an extra mutate:

df %>%
  pivot_longer(
    cols = -ID, 
    names_to = "CATEGORY_TYPE", 
    values_to = "SCORE"
  ) %>%
  separate_wider_delim(CATEGORY_TYPE, delim = "..", names = c("CATEGORY", "TYPE")) %>%
  # Only clean the two text columns, so that ID and SCORE stay numeric
  mutate(across(c(CATEGORY, TYPE), ~trimws(gsub("\\.", " ", .x))))

#      ID CATEGORY   TYPE        SCORE
#   <dbl> <chr>      <chr>       <dbl>
# 1     1 Zebra fish one             3
# 2     1 Zebra fish two             4
# 3     1 Dog caut   zero            9
# 4     1 Dog caut   hello there    12
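
Note that the cleanup above cannot restore the hyphen in Dog-caut, because the original names were already altered when the data frame was created. An alternative, sketched here on the assumption that you control how the data frame is built, is to pass check.names = FALSE so the names are kept verbatim and can be split on the literal " (". This needs tidyr 1.3.0 or later for separate_wider_delim; the name long_df is taken from the question.

```r
library(dplyr)
library(tidyr)

# check.names = FALSE keeps the column names exactly as written
df <- data.frame(
  ID = 1,
  `Zebra fish (one)` = 3,
  `Zebra fish (two)` = 4,
  `Dog-caut (zero)` = 9,
  `Dog-caut (hello there)` = 12,
  check.names = FALSE
)

long_df <- df %>%
  pivot_longer(
    cols = -ID,
    names_to = "CATEGORY_TYPE",
    values_to = "SCORE"
  ) %>%
  # delim is treated as a fixed string, so " (" needs no escaping
  separate_wider_delim(CATEGORY_TYPE, delim = " (", names = c("CATEGORY", "TYPE")) %>%
  # Drop the trailing ")" left over in TYPE
  mutate(TYPE = sub(")", "", TYPE, fixed = TRUE))

long_df
```

With the names preserved, the separate() call from the question should also work unchanged, since its " \\(" separator now matches.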