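I'm learning to use the excellent "expss" R package.
I need to know whether this package can build a contingency table between a multiple-response variable and a categorical variable while taking a weight variable into account.
In this data frame, the categorical variable is "sex" and the weight variable is "survey_weight":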
library(tibble) # provides tribble()
demo <- tribble(
~dummy1, ~dummy2, ~dummy3, ~survey_weight, ~sex,
1, 0, 0, 1.5, "male",
1, 1, 0, 1.5, "female",
1, 1, 1, .5, "female",
0, 1, 1, 1.5, "male",
1, 1, 1, .5, "male",
0, 0, 1, .5, "male",
)
demo
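I need the percentages to be based on the total number of respondents who answered the question, not on the total number of responses.
Thanks in advance!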
library(expss)
demo = text_to_columns('
dummy1 dummy2 dummy3 survey_weight sex
1 0 0 1.5 male
1 1 0 1.5 female
1 1 1 .5 female
0 1 1 1.5 male
1 1 1 .5 male
0 0 1 .5 male
')
demo %>%
tab_cells(mdset(dummy1 %to% dummy3)) %>% # 'mdset' marks these columns as a multiple dichotomy set
tab_cols(sex) %>% # columns
tab_weight(survey_weight) %>% # weight
tab_stat_cpct() %>% # statistic
tab_pivot()
# | | sex | |
# | | female | male |
# | ------------ | ------ | ---- |
# | dummy1 | 100 | 50.0 |
# | dummy2 | 100 | 50.0 |
# | dummy3 | 25 | 62.5 |
# | #Total cases | 2 | 4.0 |
# shorter notation with the same result
calc_cro_cpct(demo, mdset(dummy1 %to% dummy3), sex, weight = survey_weight)
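As a sanity check on the table above, the same percentages can be reproduced by hand in base R. This is only a sketch: it assumes the base for each column is the weighted number of respondents of that sex (every respondent in this toy data ticked at least one item, so nobody drops out of the base). The names `items`, `base_by_sex`, `ticked_by_sex` and `manual_cpct` are made up for this illustration and are not part of expss.

```r
# Same six respondents as in the question, as a plain base-R data frame.
demo <- data.frame(
  dummy1 = c(1, 1, 1, 0, 1, 0),
  dummy2 = c(0, 1, 1, 1, 1, 0),
  dummy3 = c(0, 0, 1, 1, 1, 1),
  survey_weight = c(1.5, 1.5, 0.5, 1.5, 0.5, 0.5),
  sex = c("male", "female", "female", "male", "male", "male")
)

items <- c("dummy1", "dummy2", "dummy3")  # hypothetical helper name

# Weighted number of respondents per sex: the assumed percentage base.
base_by_sex <- tapply(demo$survey_weight, demo$sex, sum)

# Weighted number of respondents who ticked each item, per sex
# (rows = sex, columns = items).
ticked_by_sex <- sapply(items, function(v) {
  tapply(demo$survey_weight * demo[[v]], demo$sex, sum)
})

# Column percentages on a respondent base (rows = items, columns = sex).
manual_cpct <- t(ticked_by_sex / as.vector(base_by_sex)) * 100
manual_cpct
```

Because the denominator is the weighted count of respondents rather than the weighted count of ticks, the percentages within a column can add up to more than 100, which is what the question asks for.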
Maybe we can use cro_cpct:
library(expss)
calculate(demo, cro_cpct(list(dummy1, dummy2, dummy3), weight = survey_weight, sex))
#
# | | sex | |
# | | female | male |
# | ------------ | ------ | ---- |
# | 0 | | 50.0 |
# | 1 | 100 | 50.0 |
# | #Total cases | 2 | 4.0 |
# | 0 | | 50.0 |
# | 1 | 100 | 50.0 |
# | #Total cases | 2 | 4.0 |
# | 0 | 75 | 37.5 |
# | 1 | 25 | 62.5 |
# | #Total cases | 2 | 4.0 |