Splitting one row per group into one row per subject


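I have data in which each row represents a household, and I would like one row per individual within each household.

The data looks similar to this: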

df <- data.frame(
  village      = rep("aaa", 5),
  household_ID = c(1, 2, 3, 4, 5),
  name_1       = c("Aldo", "Giovanni", "Giacomo", "Pippo", "Pippa"),
  outcome_1    = c("yes", "no", "yes", "no", "no"),
  name_2       = c("John", "Mary", "Cindy", "Eva", "Doron"),
  outcome_2    = c("yes", "no", "no", "no", "no")
)

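I would still like to keep the data in wide format, with just one individual (and the associated outcome variables) per row. I can find examples of how to do the opposite, going from individual data to grouped data with dcast, but I could not find an example for the problem I am facing now.

I tried melt: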

reshape2::melt(df, id.vars = "household_ID")

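but what I get is data in fully long format, with every name and outcome stacked into a single value column.

Any suggestions are welcome...

Thanks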

r reshape
1 Answer

Use pivot_longer() from tidyr:

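library(tidyr)  # also re-exports the %>% pipe

df %>%
  # Reshape every column except the two identifiers.
  pivot_longer(-c(village, household_ID),
               # ".value": the part before "_" becomes the output column name
               # (name, outcome); the part after "_" goes into column "n".
               names_to = c(".value", "n"),
               names_sep = "_")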

# # A tibble: 10 x 5
#    village household_ID n     name     outcome
#    <fct>          <dbl> <chr> <fct>    <fct>  
#  1 aaa                1 1     Aldo     yes    
#  2 aaa                1 2     John     yes    
#  3 aaa                2 1     Giovanni no     
#  4 aaa                2 2     Mary     no     
#  5 aaa                3 1     Giacomo  yes    
#  6 aaa                3 2     Cindy    no     
#  7 aaa                4 1     Pippo    no     
#  8 aaa                4 2     Eva      no     
#  9 aaa                5 1     Pippa    no     
# 10 aaa                5 2     Doron    no  
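
If installing tidyr is not an option, roughly the same layout can be obtained with reshape() from base R. This is only a sketch of an alternative, not part of the original answer: the object name per_person is made up for illustration, the timevar name "n" is chosen to mirror the tidyr output above, and the columns are assumed to be character rather than factor.

```r
# Rebuild the example data (assumption: character columns, not factors).
df <- data.frame(
  village      = rep("aaa", 5),
  household_ID = c(1, 2, 3, 4, 5),
  name_1       = c("Aldo", "Giovanni", "Giacomo", "Pippo", "Pippa"),
  outcome_1    = c("yes", "no", "yes", "no", "no"),
  name_2       = c("John", "Mary", "Cindy", "Eva", "Doron"),
  outcome_2    = c("yes", "no", "no", "no", "no"),
  stringsAsFactors = FALSE
)

# Stack the paired columns: name_1/name_2 -> name, outcome_1/outcome_2 -> outcome.
# "per_person" is a hypothetical name used only for this example.
per_person <- reshape(
  df,
  direction = "long",
  varying   = list(c("name_1", "name_2"), c("outcome_1", "outcome_2")),
  v.names   = c("name", "outcome"),
  timevar   = "n",            # index of the person within the household
  times     = 1:2,
  idvar     = "household_ID"
)

# reshape() returns all n = 1 rows first, then all n = 2 rows;
# sort so that members of the same household sit next to each other.
per_person <- per_person[order(per_person$household_ID, per_person$n), ]
rownames(per_person) <- NULL

print(per_person)
```

Unlike pivot_longer(), reshape() needs the column pairs spelled out in varying, so this version has to be extended by hand if a household has more than two members.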

Data

df <- structure(list(village = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "aaa", class = "factor"), 
    household_ID = c(1, 2, 3, 4, 5), name_1 = structure(c(1L, 
    3L, 2L, 5L, 4L), .Label = c("Aldo", "Giacomo", "Giovanni", 
    "Pippa", "Pippo"), class = "factor"), outcome_1 = structure(c(2L, 
    1L, 2L, 1L, 1L), .Label = c("no", "yes"), class = "factor"), 
    name_2 = structure(c(4L, 5L, 1L, 3L, 2L), .Label = c("Cindy", 
    "Doron", "Eva", "John", "Mary"), class = "factor"), outcome_2 = structure(c(2L, 
    1L, 1L, 1L, 1L), .Label = c("no", "yes"), class = "factor")), class = "data.frame", row.names = c(NA, -5L))