Create a dummy variable by ID


I have household roster data, as shown in the data frame below:

hhroster <- data.frame(
  HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
  INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
  response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no", "yes", "yes", "no"),
  response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no")
)

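I would like to create a household-level dummy variable that equals 1 when at least one individual in the household answered "yes". The desired output is: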

hh <- data.frame(
  HHID = c(1, 2, 3, 4, 5, 6),
  HH_response_1 = c(1, 1, 0, 1, 1, 0),
  HH_response_2 = c(1, 0, 1, 0, 1, 0)
)

Thanks in advance!

r dplyr
1 Answer

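Using the dplyr package, you can do something like this (the native pipe |> needs R 4.1 or later, and the .by argument of summarise() needs dplyr 1.1.0 or later):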

library(dplyr)

hh <- hhroster |>
  # Convert the "yes"/"no" answers to logical flags
  mutate(
    response_1_logic = response_1 == "yes",
    response_2_logic = response_2 == "yes"
  ) |>
  # Within each household, any() is TRUE if at least one member said "yes"
  summarise(
    HH_response_1 = as.numeric(any(response_1_logic)),
    HH_response_2 = as.numeric(any(response_2_logic)),
    .by = HHID
  ) |>
  select(HHID, HH_response_1, HH_response_2)

Output

> hh
  HHID HH_response_1 HH_response_2
1    1             1             1
2    2             1             0
3    3             0             1
4    4             1             0
5    5             1             1
6    6             0             0
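
If the roster has many response columns, writing one line per column gets tedious. The following is a minimal sketch of the same idea using across(), assuming every column of interest starts with "response_"; the name hh_across and the "HH_" prefix are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same example data as in the question
hhroster <- data.frame(
  HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
  INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
  response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no", "yes", "yes", "no"),
  response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no")
)

# Assumption: all columns to summarise share the "response_" prefix.
# For each such column, flag the household with 1 if any member said "yes".
hh_across <- hhroster |>
  summarise(
    across(
      starts_with("response_"),
      \(x) as.numeric(any(x == "yes")),
      .names = "HH_{.col}"
    ),
    .by = HHID
  )

print(hh_across)
```

If some answers may be missing, any(x == "yes", na.rm = TRUE) treats NA as "not yes" instead of letting it propagate into the household flag.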