I have data from a household roster, as in the data frame below:
hhroster <- data.frame(
  HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
  INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
  response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no", "yes", "yes", "no"),
  response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no")
)
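I would like to create a household-level dummy variable for each response column that equals 1 if at least one individual in the household answered "yes", and 0 otherwise. The desired output is: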
hh <- data.frame(
  HHID = c(1, 2, 3, 4, 5, 6),
  HH_response_1 = c(1, 1, 0, 1, 1, 0),
  HH_response_2 = c(1, 0, 1, 0, 1, 0)
)
Thanks in advance!
Using the dplyr package, you can do something like this:
library(dplyr)

# Note: the native pipe |> needs R >= 4.1.0, and .by needs dplyr >= 1.1.0.
hh <- hhroster |>
  # Convert the "yes"/"no" strings to logical values
  mutate(
    response_1_logic = response_1 == "yes",
    response_2_logic = response_2 == "yes"
  ) |>
  # For each household, flag whether any member answered "yes"
  summarise(
    HH_response_1 = as.numeric(any(response_1_logic)),
    HH_response_2 = as.numeric(any(response_2_logic)),
    .by = HHID
  )
Output:
> hh
  HHID HH_response_1 HH_response_2
1    1             1             1
2    2             1             0
3    3             0             1
4    4             1             0
5    5             1             1
6    6             0             0
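
If the roster has many response columns, writing one line per column gets tedious. Here is a minimal sketch of the same idea using across(), assuming every column to summarise starts with "response_" (that naming pattern and the name hh_across are my assumptions, not something stated in the question):

```r
library(dplyr)

# Same example data as in the question
hhroster <- data.frame(
  HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
  INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
  response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no", "yes", "yes", "no"),
  response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no")
)

# hh_across is a hypothetical name chosen for this sketch.
hh_across <- hhroster |>
  summarise(
    # Apply the same "any yes?" check to every response_* column and
    # prefix each result with "HH_" (e.g. response_1 -> HH_response_1)
    across(
      starts_with("response_"),
      ~ as.numeric(any(.x == "yes")),
      .names = "HH_{.col}"
    ),
    .by = HHID
  )

hh_across
```

If some responses may be missing, any(.x == "yes", na.rm = TRUE) treats an NA as "not yes" rather than letting it propagate into the household-level result.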