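I am trying to calculate a cumulative sum for students who were exposed to a support network while they were in school.
An example data frame would be (ID = student, Term = term of interest, Support = exposure to the support network):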
df <- data.frame(ID=c(123451, 123451, 123451, 123451, 123452, 123452, 123452, 123452,
123452, 123452, 123452, 123453, 123453, 123453, 123453, 123453, 123453, 123453, 123453),
Term= c(1141, 1148, 1158, 1141, 1158, 1161, 1148, 1151, 1158, 1138,
1141, 1138, 1141, 1141, 1148, 1138, 1148, 1158, 1161),
Support = c(1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1))
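Since I am looking for cumulative exposure (starting from each student's earliest term), I first sorted the data by ID and Term:
df <- df[order(df[,1], df[,2]),]
Then I calculated the cumulative sum of the Support variable into a separate column: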
df$Dosage <- ave(df$Support, df$ID, FUN=cumsum)
Output:
ID Term Support Dosage
1 123451 1141 1 1
4 123451 1141 1 2
2 123451 1148 0 2
3 123451 1158 1 3
10 123452 1138 0 0
11 123452 1141 0 0
7 123452 1148 1 1
8 123452 1151 1 2
5 123452 1158 1 3
9 123452 1158 1 4
6 123452 1161 0 4
12 123453 1138 1 1
16 123453 1138 0 1
13 123453 1141 0 1
14 123453 1141 1 2
15 123453 1148 0 2
17 123453 1148 1 3
18 123453 1158 1 4
19 123453 1161 1 5
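While this is useful, if a student has two rows for the same term, I would like the Dosage value on both rows to reflect the maximum for that term.
So for Student = 123451 and Term = 1141, I want both Dosage values to equal 2.
For Student = 123452 and Term = 1158, I want both Dosage values to equal 4.
For Student = 123453 and Term = 1148, I want both Dosage values to equal 3.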
I'm not entirely sure what the problem is, but perhaps you could try the following:
library(dplyr)
df <- data.frame(ID=c(123451, 123451, 123451, 123451, 123452, 123452, 123452, 123452,
123452, 123452, 123452, 123453, 123453, 123453, 123453, 123453, 123453, 123453, 123453),
Term= c(1141, 1148, 1158, 1141, 1158, 1161, 1148, 1151, 1158, 1138,
1141, 1138, 1141, 1141, 1148, 1138, 1148, 1158, 1161),
Support = c(1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1))
df %>%
arrange(ID, Term) %>%
group_by(ID) %>%
mutate(Dosage = cumsum(Support)) %>%
ungroup() %>%
group_by(ID, Term) %>%
mutate(Dosage = max(Dosage)) %>%
ungroup()
ID Term Support Dosage
1 123451 1141 1 2
2 123451 1141 1 2
3 123451 1148 0 2
4 123451 1158 1 3
5 123452 1138 0 0
6 123452 1141 0 0
7 123452 1148 1 1
8 123452 1151 1 2
9 123452 1158 1 4
10 123452 1158 1 4
11 123452 1161 0 4
12 123453 1138 1 1
13 123453 1138 0 1
14 123453 1141 0 2
15 123453 1141 1 2
16 123453 1148 0 3
17 123453 1148 1 3
18 123453 1158 1 4
19 123453 1161 1 5
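If you would rather stay in base R, the same two-step idea (a running total per student, then the per-term maximum) can probably be sketched with ave() alone. This is only a minimal sketch under a few assumptions: the key variable is a helper name introduced here for illustration, and pasting ID and Term into a single grouping key is my own choice, made so that ave() never sees an empty ID/Term combination.

```r
# Rebuild the example data from the question.
df <- data.frame(
  ID = c(rep(123451, 4), rep(123452, 7), rep(123453, 8)),
  Term = c(1141, 1148, 1158, 1141, 1158, 1161, 1148, 1151, 1158, 1138,
           1141, 1138, 1141, 1141, 1148, 1138, 1148, 1158, 1161),
  Support = c(1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1)
)

# Sort by student, then term, so the running total follows term order.
df <- df[order(df$ID, df$Term), ]

# Step 1: running total of Support within each student.
df$Dosage <- ave(df$Support, df$ID, FUN = cumsum)

# Step 2: one grouping key per student/term pair (hypothetical helper name),
# then give every row in that pair the largest running total of the pair.
key <- paste(df$ID, df$Term)
df$Dosage <- ave(df$Dosage, key, FUN = max)

print(df)
```

Because max() returns a single value per group, ave() recycles it across every row of that student/term pair, which is what makes duplicated terms share one Dosage value.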