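My adsl dataset has several numeric variables, and I want to summarise them to get a summary dataset. Below is my code, using the AGE and STAGEN variables as examples: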
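library(dplyr)

# Group once so that every summarise() call below works per treatment arm
adsl <- adsl %>% group_by(ACTARM)

AGE <- summarise(adsl,
  param  = 'AGE',
  mean   = mean(AGE, na.rm = TRUE),
  sd     = sd(AGE, na.rm = TRUE),
  median = quantile(AGE, probs = 0.5, na.rm = TRUE),
  min    = min(AGE, na.rm = TRUE),
  max    = max(AGE, na.rm = TRUE)
)

STAGEN <- summarise(adsl,
  param  = 'STAGEN',
  mean   = mean(STAGEN, na.rm = TRUE),
  sd     = sd(STAGEN, na.rm = TRUE),
  median = quantile(STAGEN, probs = 0.5, na.rm = TRUE),
  min    = min(STAGEN, na.rm = TRUE),
  max    = max(STAGEN, na.rm = TRUE)
)

# Stack the per-variable summaries into one dataset
result <- rbind(AGE, STAGEN)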
My question is how to write a for loop that replaces the repeated code. This is what I tried:
for (i in c('AGE', 'STAGEN')){
  i <- adsl %>%
    summarise(
      param = i,
      mean = mean(adsl[i], na.rm = TRUE),
      sd = sd(adsl[i], na.rm = TRUE),
      median = quantile(adsl[i], probs = 0.5),
      min = min(adsl[i]),
      max = max(adsl[i])
    )
}
I hope someone can help me.
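For reference, here is a minimal sketch of one way the loop above could be made to work. Two things seem to go wrong in the attempt: `adsl[i]` is a data frame rather than a numeric vector, so `mean()` and friends cannot use it, and `i <- ...` overwrites the loop variable instead of storing a result per variable. The sketch assumes dplyr is installed, uses the `.data[[v]]` pronoun to pick a column by its name, and collects the pieces in a list. The toy `adsl` and its STAGEN values are made up purely for illustration, and `median()` is swapped in for `quantile(..., probs = 0.5)`.

```r
library(dplyr)

# Toy stand-in for adsl (hypothetical values, not the real data)
adsl <- tibble(
  ACTARM = rep(c("TRT1", "TRT2"), 5),
  AGE    = 23:32,
  STAGEN = rep(1:5, each = 2)
)

vars <- c("AGE", "STAGEN")

# One list slot per variable, filled inside the loop
pieces <- vector("list", length(vars))
names(pieces) <- vars

for (v in vars) {
  pieces[[v]] <- adsl %>%
    group_by(ACTARM) %>%
    summarise(
      param  = v,                                  # the variable name as a string
      mean   = mean(.data[[v]], na.rm = TRUE),     # .data[[v]] looks the column up by name
      sd     = sd(.data[[v]], na.rm = TRUE),
      median = median(.data[[v]], na.rm = TRUE),
      min    = min(.data[[v]], na.rm = TRUE),
      max    = max(.data[[v]], na.rm = TRUE),
      .groups = "drop"
    )
}

# Stack the per-variable summaries into a single dataset
result <- bind_rows(pieces)
```

Collecting the pieces in a list and binding once at the end avoids growing a data frame inside the loop, although for a handful of variables either approach should be fine.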
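Let me make my question clearer. I have prepared an example adsl dataset, and I want to get a summary dataset from it.
When I check my function with AGE alone it works fine, but when I use a for loop to process multiple variables, the param column in the output shows "i", whereas I want it to show "AGE", "HEIGHT", "WEIGHT".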
library(dplyr)  # provides tibble(), the pipe and summarize()

SUBJID <- paste0('S', 1:10)
ACTARM <- paste0('TRT', rep(1:2, 5))
AGE <- 23:32
HEIGHT <- 160:169
WEIGHT <- 60:69
adsl <- tibble(
  SUBJID = SUBJID,
  ACTARM = ACTARM,
  AGE = AGE,
  WEIGHT = WEIGHT,
  HEIGHT = HEIGHT
)
sum_num <- function(data, var, group_var = ACTARM) {
  var_name <- tail(as.character(substitute(var)), 1)
  data %>% group_by({{ group_var }}) %>%
    summarize(
      param = var_name,
      n = sum(!is.na({{ var }})),
      mean = mean({{ var }}, na.rm = TRUE),
      sd = sd({{ var }}, na.rm = TRUE),
      median = quantile({{ var }}, probs = 0.5, type = 3),
      min = min({{ var }})
    )
}
# check the function with one variable
chk <- sum_num(adsl, AGE)
# check the function with multiple variables
output <- tibble()
for (i in c('AGE', 'HEIGHT', "WEIGHT")) {
  temp <- sum_num(adsl, .data[[i]])
  output <- rbind(output, temp)
}
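A likely explanation for the "i" in the param column: inside `sum_num()`, `substitute(var)` captures the expression `.data[[i]]` exactly as written in the call, and the last element of its character form is the symbol name `i`, not the value that `i` holds. A minimal sketch of one workaround follows, assuming dplyr is installed. `sum_num_chr` and its `var_name` argument are hypothetical names introduced here, not part of the original function; the idea is to pass the column name as a string so that the same string can label the row and select the column.

```r
library(dplyr)

adsl <- tibble(
  SUBJID = paste0("S", 1:10),
  ACTARM = paste0("TRT", rep(1:2, 5)),
  AGE    = 23:32,
  HEIGHT = 160:169,
  WEIGHT = 60:69
)

# The captured expression .data[[i]] splits into "[[", ".data" and "i",
# so tail(..., 1) returns "i" whatever value i holds at that moment
as.character(quote(.data[[i]]))

# Hypothetical variant of sum_num(): var_name is a column name given as a string
sum_num_chr <- function(data, var_name, group_var = ACTARM) {
  data %>%
    group_by({{ group_var }}) %>%
    summarise(
      param  = var_name,
      n      = sum(!is.na(.data[[var_name]])),
      mean   = mean(.data[[var_name]], na.rm = TRUE),
      sd     = sd(.data[[var_name]], na.rm = TRUE),
      median = quantile(.data[[var_name]], probs = 0.5, type = 3,
                        na.rm = TRUE, names = FALSE),
      min    = min(.data[[var_name]], na.rm = TRUE),
      .groups = "drop"
    )
}

# Apply the function to each variable name and stack the results
output <- bind_rows(
  lapply(c("AGE", "HEIGHT", "WEIGHT"), sum_num_chr, data = adsl)
)
```

The same function also works inside a plain for loop, for example `sum_num_chr(adsl, i)` with `i` looping over the names, since the string is evaluated before it is used as a label.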