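Here is some basic information about my data, which is a longitudinal dataset. The variables are: ID, Age, Gender, Q1AnswerTime1, Q2AnswerTime1, Q3AnswerTime1, Q1AnswerTime2, Q2AnswerTime2, Q3AnswerTime2, Q1AnswerTime3, Q2AnswerTime3, Q3AnswerTime3.
How can I convert this dataset from wide format to long format? I want the long format to contain only 7 variables: ID, Age, Gender, Q1Answer, Q2Answer, Q3Answer, and Time. The values of Q1Answer, Q2Answer, and Q3Answer will depend on the Time variable.
To be clear, in the original study 5 people took part, and data were collected over 3 years. Each year, every person was asked 3 questions: Q1, Q2, and Q3. So in the end we have 12 variables in wide format.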
Update, with code:
library(tibble)  # provides tibble()

df <- tibble(
  ID = c(1, 2, 3),
  Age = c(25, 32, 28),
  Gender = c("Male", "Female", "Male"),
  Q1AnswerTime1 = c(10, 15, 12),
  Q2AnswerTime1 = c(7, 9, 8),
  Q3AnswerTime1 = c(5, 6, 4),
  Q1AnswerTime2 = c(11, 16, 13),
  Q2AnswerTime2 = c(8, 10, 9),
  Q3AnswerTime2 = c(6, 7, 5),
  Q1AnswerTime3 = c(12, 17, 14),
  Q2AnswerTime3 = c(9, 11, 10),
  Q3AnswerTime3 = c(7, 8, 6)
)
df
The expected output would look like this:
dfLong <- tibble(
  ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Age = c(25, 25, 25, 32, 32, 32, 28, 28, 28),
  Gender = c("Male", "Male", "Male", "Female", "Female", "Female", "Male", "Male", "Male"),
  Q1 = c(10, 11, 12, 15, 16, 17, 12, 13, 14),
  Q2 = c(7, 8, 9, 9, 10, 11, 8, 9, 10),
  Q3 = c(5, 6, 7, 6, 7, 8, 4, 5, 6),
  Time = c(1, 2, 3, 1, 2, 3, 1, 2, 3)
)
I tried using the tidyr functions in R, but this case is too complicated for me. Can you help?
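library(tidyverse)

# pivot the data from wide to long format
df %>%
  pivot_longer(
    cols = starts_with("Q"),
    # ".value" keeps the part before the separator (Q1, Q2, Q3) as column names;
    # the part after the separator goes into a new "Time" column
    names_to = c(".value", "Time"),
    names_sep = "AnswerTime"
  ) %>%
  # drop Time, then re-select it, so that it ends up as the last column
  select(-Time, Time)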
# A tibble: 9 × 7
     ID   Age Gender    Q1    Q2    Q3 Time
  <dbl> <dbl> <chr>  <dbl> <dbl> <dbl> <chr>
1     1    25 Male      10     7     5 1
2     1    25 Male      11     8     6 2
3     1    25 Male      12     9     7 3
4     2    32 Female    15     9     6 1
5     2    32 Female    16    10     7 2
6     2    32 Female    17    11     8 3
7     3    28 Male      12     8     4 1
8     3    28 Male      13     9     5 2
9     3    28 Male      14    10     6 3
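
Note that Time comes out as a character column (<chr>) in the output above, whereas the expected dfLong has it as a number. One possible variant, sketched under the assumption that a reasonably recent tidyr (with the names_transform argument) and dplyr (with relocate()) are installed, converts Time while pivoting and moves it to the end explicitly. The name df_long is just a placeholder chosen for this example, and as.integer could be swapped for as.numeric if a double column is preferred.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same toy data as in the question
df <- tibble(
  ID = c(1, 2, 3),
  Age = c(25, 32, 28),
  Gender = c("Male", "Female", "Male"),
  Q1AnswerTime1 = c(10, 15, 12),
  Q2AnswerTime1 = c(7, 9, 8),
  Q3AnswerTime1 = c(5, 6, 4),
  Q1AnswerTime2 = c(11, 16, 13),
  Q2AnswerTime2 = c(8, 10, 9),
  Q3AnswerTime2 = c(6, 7, 5),
  Q1AnswerTime3 = c(12, 17, 14),
  Q2AnswerTime3 = c(9, 11, 10),
  Q3AnswerTime3 = c(7, 8, 6)
)

# df_long is a hypothetical name used only in this sketch
df_long <- df %>%
  pivot_longer(
    cols = starts_with("Q"),
    names_to = c(".value", "Time"),
    names_sep = "AnswerTime",
    # convert the extracted Time labels ("1", "2", "3") to integers
    names_transform = list(Time = as.integer)
  ) %>%
  # move Time after the last column
  relocate(Time, .after = last_col())

df_long
```

The result should have the same rows and column order as the output above, with Time stored as an integer instead of a character string.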