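I have data where each row represents a household, and I would like to have one row per individual across the different households.
The data looks similar to this: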
df <- data.frame(village = rep("aaa",5),household_ID = c(1,2,3,4,5),name_1 = c("Aldo","Giovanni","Giacomo","Pippo","Pippa"),outcome_1 = c("yes","no","yes","no","no"),name_2 = c("John","Mary","Cindy","Eva","Doron"),outcome_2 = c("yes","no","no","no","no"))
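I would still like to keep the data in wide format, just with one individual (and the related outcome variables) per row. I can find examples showing how to do the opposite, going from individual to grouped data with dcast, but I could not find an example for the problem I am facing now.
I have tried melt: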
reshape2::melt(df, id.vars = "household_ID")
but that gives me fully long-format data, with every value stacked into a single column.
Any suggestions are welcome...
Thanks
Use pivot_longer() from tidyr:
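library(tidyr)

df %>%
  pivot_longer(-c(village, household_ID),     # leave the household-level columns untouched
               names_to = c(".value", "n"),   # name prefix becomes the output column, suffix goes to "n"
               names_sep = "_")               # split names such as name_1 at the underscore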
# # A tibble: 10 x 5
#    village household_ID n     name     outcome
#    <fct>          <dbl> <chr> <fct>    <fct>
#  1 aaa                1 1     Aldo     yes
#  2 aaa                1 2     John     yes
#  3 aaa                2 1     Giovanni no
#  4 aaa                2 2     Mary     no
#  5 aaa                3 1     Giacomo  yes
#  6 aaa                3 2     Cindy    no
#  7 aaa                4 1     Pippo    no
#  8 aaa                4 2     Eva      no
#  9 aaa                5 1     Pippa    no
# 10 aaa                5 2     Doron    no
Data:
df <- structure(list(village = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "aaa", class = "factor"),
household_ID = c(1, 2, 3, 4, 5), name_1 = structure(c(1L,
3L, 2L, 5L, 4L), .Label = c("Aldo", "Giacomo", "Giovanni",
"Pippa", "Pippo"), class = "factor"), outcome_1 = structure(c(2L,
1L, 2L, 1L, 1L), .Label = c("no", "yes"), class = "factor"),
name_2 = structure(c(4L, 5L, 1L, 3L, 2L), .Label = c("Cindy",
"Doron", "Eva", "John", "Mary"), class = "factor"), outcome_2 = structure(c(2L,
1L, 1L, 1L, 1L), .Label = c("no", "yes"), class = "factor")), class = "data.frame", row.names = c(NA, -5L))
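
If the real column names contain more than one underscore, splitting with names_sep may no longer work cleanly. A minimal sketch of one possible variant, assuming every per-person column ends in an underscore followed by digits: it uses the names_pattern argument of pivot_longer() instead of names_sep. The object name people and the regular expression are illustrative choices, not part of the original answer.

```r
library(tidyr)

# Example data from the question, rebuilt so the block is self-contained
df <- data.frame(
  village      = rep("aaa", 5),
  household_ID = c(1, 2, 3, 4, 5),
  name_1       = c("Aldo", "Giovanni", "Giacomo", "Pippo", "Pippa"),
  outcome_1    = c("yes", "no", "yes", "no", "no"),
  name_2       = c("John", "Mary", "Cindy", "Eva", "Doron"),
  outcome_2    = c("yes", "no", "no", "no", "no")
)

# Assumption: each per-person column is "<variable>_<digits>".
# The first capture group becomes the output column (".value"),
# the second capture group is stored in the column "n".
people <- pivot_longer(
  df,
  cols          = -c(village, household_ID),
  names_to      = c(".value", "n"),
  names_pattern = "(.*)_(\\d+)$"
)

# Sanity checks: one row per person, i.e. two rows per household
stopifnot(nrow(people) == 2 * nrow(df))
stopifnot(all(table(people$household_ID) == 2))
```

The checks at the end only confirm the shape of the result. On the example data this should give the same five columns as the names_sep version, since each name contains exactly one underscore.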