How do I combine column values and column names into a unique ID while keeping the cell values in R?


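I have a wide-format data frame that I want to make long by collapsing several columns into one, but I cannot figure out how to get pivot_longer() and paste() to work together.

Here is a sample dataset: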

df <- data.frame(Site_Number    = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

It looks like this:

  Site_Number Subsite_Number    C1  C2    C3  C4
           1              1   Red Red  Blue Red
           1              2  Blue Red Green  NA
           1              3 Green Red    NA  NA
           2              1   Red Red  Blue  NA
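
One detail of the sample data is worth checking before reshaping: the missing entries are written as the quoted string "NA", not as R's missing value NA. The minimal sketch below rebuilds only the C3 and C4 columns from the sample (the variable names n_true_na and n_string_na are made up for illustration) and counts each kind.

```r
# Only the two columns from the sample data that contain "NA"
df_check <- data.frame(C3 = c("Blue", "Green", "NA", "Blue"),
                       C4 = c("Red", "NA", "NA", "NA"))

# is.na() finds nothing, because "NA" in quotes is an ordinary string
n_true_na <- sum(is.na(df_check))

# Comparing against the string finds the placeholder cells instead
n_string_na <- sum(df_check == "NA")
```

This is why any solution has to either filter on the string "NA" or convert it to a true NA first.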

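I want to build a unique ID from Site_Number, Subsite_Number, and the "C" column name, and reshape the data frame to long format while keeping the values from the C columns. The catch is that some "C" columns contain NA, and those cells should not be carried over into the new dataset. This is the output I am hoping for: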

     ID  Color
 S1T1C1   Red
 S1T1C2   Red
 S1T1C3  Blue
 S1T1C4   Red
 S1T2C1  Blue
 S1T2C2   Red
 S1T2C3 Green
 S1T3C1 Green
 S1T3C2   Red
 S2T1C1   Red
 S2T1C2   Red
 S2T1C3  Blue

In the ID, S = Site_Number, T = Subsite_Number, and C comes from the column names C1 through C4.

Any ideas on how to do this properly?

r rstudio data-manipulation
1 Answer
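library(dplyr)
library(tidyr)
df |>
  # Build the site/subsite prefix, e.g. "S1T1"
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number)) |>
  select(-Site_Number, -Subsite_Number) |>
  # Stack C1:C4 into `name` (column name) and `value` (cell content)
  pivot_longer(-ID) |>
  # Drop the literal string "NA" as well as true missing values
  filter(value != "NA" & !is.na(value)) |>
  # Append the column name to complete the ID, e.g. "S1T1C1"
  mutate(ID = paste0(ID, name))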
# # A tibble: 12 × 3
#    ID     name  value
#    <chr>  <chr> <chr>
#  1 S1T1C1 C1    Red  
#  2 S1T1C2 C2    Red  
#  3 S1T1C3 C3    Blue 
#  4 S1T1C4 C4    Red  
#  5 S1T2C1 C1    Blue 
#  6 S1T2C2 C2    Red  
#  7 S1T2C3 C3    Green
#  8 S1T3C1 C1    Green
#  9 S1T3C2 C2    Red  
# 10 S2T1C1 C1    Red  
# 11 S2T1C2 C2    Red  
# 12 S2T1C3 C3    Blue 
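
The output above still carries the helper column `name`, and the value column is not called Color as in the question. A possible variant, offered as a sketch rather than the only way to do it, converts the "NA" strings to true missing values with na_if(), lets pivot_longer() drop them through values_drop_na, and keeps only the two requested columns. The intermediate column name "C" and the object name `result` are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(Site_Number    = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

result <- df |>
  # Turn the literal string "NA" into a true missing value
  mutate(across(C1:C4, ~ na_if(.x, "NA"))) |>
  # Stack C1:C4; values_drop_na = TRUE discards the missing cells
  pivot_longer(C1:C4, names_to = "C", values_to = "Color",
               values_drop_na = TRUE) |>
  # Assemble the ID from site, subsite, and the original column name
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number, C)) |>
  select(ID, Color)
```

Converting to real NA values first means the same pipeline also works if the data later arrives with proper missing values instead of "NA" strings.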