I am trying to translate the following code from Stata to R:
collapse (mean) erate_total_male laborforce_male erate_total_male_1953 laborforce_male_1953 share_expellees_male share_dest_flats instrument share_agric_1939 city_state (max) occzone_occu [aw=laborforce_male], by(bundesland_id_1953 occupation_id)
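I tried using the collapse package in R, but I am not sure how to incorporate the weights from the Stata code, or the max (although for the max I could probably just generate a new variable to work around it).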
test1 <- rep_data %>%
  mutate(bundesland_id_1953 =
    case_when(
      bundesland_id == 8 ~ 99,
      bundesland_id == 9 ~ 99,
      bundesland_id == 10 ~ 99,
    )) %>%
  group_by(bundesland_id_1953, occupation_id) %>%
  select(erate_total_male, laborforce_male, erate_total_male_1953,
         laborforce_male_1953, share_expellees_male, share_dest_flats,
         instrument_male, share_agric_1939, city_state, occzone_occu) %>%
  fmean
I also tried computing the means of all the variables with dplyr, but I ran into the same problem when adding the weights:
t6Data2 <- rep_data %>%
  mutate(bundesland_id_1953 =
    case_when(
      bundesland_id == 8 ~ 99,
      bundesland_id == 9 ~ 99,
      bundesland_id == 10 ~ 99,
    )) %>%
  group_by(bundesland_id_1953, occupation_id) %>%
  summarise_at(vars(erate_total_male, laborforce_male, erate_total_male_1953,
                    laborforce_male_1953, share_expellees_male, share_dest_flats,
                    instrument_male, share_agric_1939, city_state),
               mean)
Finally, I tried a loop, but when I run a regression with lm(), my variables do not show up:
test444 <- rep_data %>%
  mutate(bundesland_id_1953 =
    case_when(
      bundesland_id == 8 ~ 99,
      bundesland_id == 9 ~ 99,
      bundesland_id == 10 ~ 99,
    )) %>%
  group_by(bundesland_id_1953, occupation_id)
t6_data_test4 <- sapply(c(test444$erate_total_male, test444$laborforce_male,
                          test444$erate_total_male_1953, test444$laborforce_male_1953,
                          test444$share_expellees_male, test444$share_dest_flats,
                          test444$instrument_male, test444$share_agric_1939,
                          test444$city_state), function(x) {
  weighted.mean(x, weight = laborforce_male)
})
I am not sure how to proceed and would appreciate any help. I am relatively new to R, so I apologize for any obvious mistakes in my code.
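For reference, here is a minimal sketch of the computation the Stata command describes, on toy data. It assumes that with analytic weights the (mean) statistics are weighted means, sum(w * x) / sum(w), within each by() group, and that (max) is the plain group maximum. The data frame `toy` and its columns (`region`, `occupation`, `erate`, `zone`, `laborforce`) are made up for illustration and are not from my dataset. The sketch uses dplyr with base R's weighted.mean(), not the collapse package.

```r
library(dplyr)

# Toy data: all names and values are hypothetical, for illustration only
toy <- tibble(
  region     = c(1, 1, 2, 2),
  occupation = c("a", "a", "b", "b"),
  erate      = c(0.50, 0.70, 0.20, 0.40),
  zone       = c(1, 3, 2, 2),
  laborforce = c(100, 300, 50, 50)   # plays the role of the aweight
)

collapsed <- toy %>%
  group_by(region, occupation) %>%
  summarise(
    # Weighted mean within each group, weights taken from `laborforce`
    across(c(erate), ~ weighted.mean(.x, w = laborforce)),
    # Plain (unweighted) maximum within each group
    zone = max(zone),
    .groups = "drop"
  )

# Group 1: (0.50 * 100 + 0.70 * 300) / 400 = 0.65, max(zone) = 3
# Group 2: (0.20 * 50  + 0.40 * 50)  / 100 = 0.30, max(zone) = 2
print(collapsed)
```

More columns can be listed inside c(...) in across() to average several variables with the same weights. Note that weighted.mean() returns NA if a group contains missing values, unless the missing rows are handled first.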