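Merging a wide dataset and a long dataset into one wide dataset in R (by combining some column headers and some cell values into new column headers)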

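I have 2 datasets:

  1. Participant performance: 1 row per participant
  2. Participant ratings: more than 1 row per participant (1 row per rater)

I need to merge the 2 datasets into 1 dataset with 1 row per participant (including NAs).

Below are reproducible toy examples of the 2 datasets I have, followed by a reproducible toy example of the merged dataset I need.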

pPerformance <- data.frame(Participant = c(1, 2, 3, 4, 5),
                           Session = c("A", "B", "B", "A", "C"),
                           Group = c(1, 1, 2, 1, 1),
                           Answer1 = c("incorrect", "correct", "correct", "incorrect", "correct"),
                           Condition = c("chat", "essay", "essay", "chat", "essay"),
                           Answer2 = c("correct", "correct", "correct", "incorrect", "incorrect"),
                           Duration = c(96, 43, 56, 75, 23)
                           )
pPerformance
  Participant Session Group   Answer1 Condition   Answer2 Duration
1           1       A     1 incorrect      chat   correct       96
2           2       B     1   correct     essay   correct       43
3           3       B     2   correct     essay   correct       56
4           4       A     1 incorrect      chat incorrect       75
5           5       C     1   correct     essay incorrect       23

pRatings <- data.frame(Participant = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
                       Session = c("A", "B", "B", "A", "C", "A", "B", "B", "A", "C"),
                       Group = c(1, 1, 2, 1, 1, 1, 1, 2, 1, 1),
                       Rater = c("Fran", "Fran", "Fran", "Fran", "Fran",
                                 "Fred", "Fred", "Fred", "Fred", "Fred"),
                       Rating1 = c("Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", NA),
                       Rating2 = c(3, 0, 1, 2, 0, 2, 1, 1, 2, 1)
                       )
pRatings
   Participant Session Group Rater Rating1 Rating2
1            1       A     1  Fran     Yes       3
2            2       B     1  Fran      No       0
3            3       B     2  Fran     Yes       1
4            4       A     1  Fran     Yes       2
5            5       C     1  Fran      No       0
6            1       A     1  Fred      No       2
7            2       B     1  Fred     Yes       1
8            3       B     2  Fred     Yes       1
9            4       A     1  Fred     Yes       2
10           5       C     1  Fred      NA       1

pMerged <- data.frame(Participant = c(1, 2, 3, 4, 5),
                      Session = c("A", "B", "B", "A", "C"),
                      Group = c(1, 1, 2, 1, 1),
                      Answer1 = c("incorrect", "correct", "correct", "incorrect", "correct"),
                      Condition = c("chat", "essay", "essay", "chat", "essay"),
                      Answer2 = c("correct", "correct", "correct", "incorrect", "incorrect"),
                      Duration = c(96, 43, 56, 75, 23),
                      Fran_Rating1 = c("Yes", "No", "Yes", "Yes", "No"),
                      Fred_Rating1 = c("No", "Yes", "Yes", "Yes", NA),
                      Fran_Rating2 = c(3, 0, 1, 2, 0),
                      Fred_Rating2 = c(2, 1, 1, 2, 1)
                      )
pMerged
  Participant Session Group   Answer1 Condition   Answer2 Duration Fran_Rating1 Fred_Rating1
1           1       A     1 incorrect      chat   correct       96          Yes           No
2           2       B     1   correct     essay   correct       43           No          Yes
3           3       B     2   correct     essay   correct       56          Yes          Yes
4           4       A     1 incorrect      chat incorrect       75          Yes          Yes
5           5       C     1   correct     essay incorrect       23           No           NA
  Fran_Rating2 Fred_Rating2
1            3            2
2            0            1
3            1            1
4            2            2
5            0            1

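Currently I am merging by "Participant", but this gives multiple rows per participant (10 rows instead of 5 in the toy example). I am not sure how to combine the "Rater" cell values with the "Rating..." column headers in the way I need (for example, "Fran_Rating1" in the "pMerged" data frame above).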

pMergedReproducibly <- merge(pPerformance, pRatings, by = c("Participant"))      
pMergedReproducibly
   Participant Session.x Group.x   Answer1 Condition   Answer2 Duration Session.y Group.y Rater
1            1         A       1 incorrect      chat   correct       96         A       1  Fran
2            1         A       1 incorrect      chat   correct       96         A       1  Fred
3            2         B       1   correct     essay   correct       43         B       1  Fred
4            2         B       1   correct     essay   correct       43         B       1  Fran
5            3         B       2   correct     essay   correct       56         B       2  Fred
6            3         B       2   correct     essay   correct       56         B       2  Fran
7            4         A       1 incorrect      chat incorrect       75         A       1  Fred
8            4         A       1 incorrect      chat incorrect       75         A       1  Fran
9            5         C       1   correct     essay incorrect       23         C       1  Fran
10           5         C       1   correct     essay incorrect       23         C       1  Fred
   Rating1 Rating2
1      Yes       3
2       No       2
3      Yes       1
4       No       0
5      Yes       1
6      Yes       1
7      Yes       2
8      Yes       2
9       No       0
10      NA       1

Any suggestions would be very welcome!

r merge casting rstudio reshape
1 answer
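library(dplyr); library(tidyr)  # provide %>%, left_join() and pivot_wider()
pPerformance %>% 
   left_join(pivot_wider(pRatings, names_from = Rater,
                          values_from = starts_with('Rating')),
             by = c("Participant", "Session", "Group"))  # join on all shared keys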

  Participant Session Group   Answer1 Condition   Answer2 Duration Rating1_Fran Rating1_Fred Rating2_Fran Rating2_Fred
1           1       A     1 incorrect      chat   correct       96          Yes           No            3            2
2           2       B     1   correct     essay   correct       43           No          Yes            0            1
3           3       B     2   correct     essay   correct       56          Yes          Yes            1            1
4           4       A     1 incorrect      chat incorrect       75          Yes          Yes            2            2
5           5       C     1   correct     essay incorrect       23           No         <NA>            0            1
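
For readers who prefer to stay in base R, the same widen-then-join idea can be sketched with reshape() and merge(). This is an alternative to the tidyverse code above, not part of the original answer; the object names ratings_wide_base and merged_base are made up for illustration, and the toy data is copied from the question.

```r
# Toy data copied from the question
pPerformance <- data.frame(
  Participant = c(1, 2, 3, 4, 5),
  Session = c("A", "B", "B", "A", "C"),
  Group = c(1, 1, 2, 1, 1),
  Answer1 = c("incorrect", "correct", "correct", "incorrect", "correct"),
  Condition = c("chat", "essay", "essay", "chat", "essay"),
  Answer2 = c("correct", "correct", "correct", "incorrect", "incorrect"),
  Duration = c(96, 43, 56, 75, 23)
)
pRatings <- data.frame(
  Participant = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  Session = c("A", "B", "B", "A", "C", "A", "B", "B", "A", "C"),
  Group = c(1, 1, 2, 1, 1, 1, 1, 2, 1, 1),
  Rater = rep(c("Fran", "Fred"), each = 5),
  Rating1 = c("Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", NA),
  Rating2 = c(3, 0, 1, 2, 0, 2, 1, 1, 2, 1)
)

# Step 1: spread the ratings so each rater gets their own set of columns.
# idvar identifies a participant; timevar holds the values that become
# part of the new column names (Rating1_Fran, Rating1_Fred, ...).
ratings_wide_base <- reshape(
  pRatings,
  idvar = c("Participant", "Session", "Group"),
  timevar = "Rater",
  direction = "wide",
  sep = "_"
)

# Step 2: join on every shared key, which avoids the Session.x / Session.y
# duplicates produced when merging by "Participant" alone.
merged_base <- merge(pPerformance, ratings_wide_base,
                     by = c("Participant", "Session", "Group"))
merged_base
```

Because the ratings are widened before the join, each participant matches exactly one row, so the result has 5 rows rather than 10.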

Note that if you need the names in the same order as in your example (rater first, then rating), include the following in your pivot_wider call:

  names_glue = "{Rater}_{.value}"
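
Putting the note together with the join, a minimal sketch might look like the following. It assumes the dplyr and tidyr packages are installed. The object names ratings_wide and merged are illustrative and do not come from the original answer, and the explicit id_cols and by arguments are additions that only spell out the keys the default behaviour would pick anyway.

```r
library(dplyr)
library(tidyr)

# Toy data copied from the question
pPerformance <- data.frame(
  Participant = c(1, 2, 3, 4, 5),
  Session = c("A", "B", "B", "A", "C"),
  Group = c(1, 1, 2, 1, 1),
  Answer1 = c("incorrect", "correct", "correct", "incorrect", "correct"),
  Condition = c("chat", "essay", "essay", "chat", "essay"),
  Answer2 = c("correct", "correct", "correct", "incorrect", "incorrect"),
  Duration = c(96, 43, 56, 75, 23)
)
pRatings <- data.frame(
  Participant = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  Session = c("A", "B", "B", "A", "C", "A", "B", "B", "A", "C"),
  Group = c(1, 1, 2, 1, 1, 1, 1, 2, 1, 1),
  Rater = rep(c("Fran", "Fred"), each = 5),
  Rating1 = c("Yes", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", NA),
  Rating2 = c(3, 0, 1, 2, 0, 2, 1, 1, 2, 1)
)

# One row per participant; names_glue puts the rater name first,
# giving Fran_Rating1, Fred_Rating1, Fran_Rating2, Fred_Rating2.
ratings_wide <- pivot_wider(
  pRatings,
  id_cols = c(Participant, Session, Group),
  names_from = Rater,
  values_from = starts_with("Rating"),
  names_glue = "{Rater}_{.value}"
)

# Join on all three shared keys so no .x / .y duplicates appear.
merged <- left_join(pPerformance, ratings_wide,
                    by = c("Participant", "Session", "Group"))
names(merged)
```

The resulting column names follow the rater-first pattern used in the pMerged data frame from the question.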