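I want the grouped-mean function to create a new column for every factor-type column each time it is called.
So far I can reproduce the desired result, but only for a single factor variable, A.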
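library(dplyr)
library(purrr)
df <- data.frame(
  target = c(1, 4, 8, 9, 2, 1, 3, 5, 7, 1),
  A = c("A", "Z", "N", "A", "Z"),
  B = c("B", "Q", "G", "B", "T"),
  C = c("C", "Y", "C", "P", "Y"),
  # Needed on R >= 4.0, where strings are no longer converted to factors by default
  stringsAsFactors = TRUE
)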
grouped_mean <- function(data, summary_var, ...) {
  summary_var <- enquo(summary_var)
  data %>%
    # Selects only factor data types and a target column
    select(which(map_chr(., class) == "factor"), !!summary_var) %>%
    group_by(...) %>%
    # Over here I am not able to change column name, so that it yields Mean_A, Mean_B and Mean_C
    mutate(mean = mean(!!summary_var)) %>%
    ungroup()
}
grouped_mean(data = df,
             group_var = A,
             summary_var = target)
I tried to loop over it:
map_df(df, grouped_mean(data = df, summary_var = target))
But I get this error:
Error: Can't convert a tbl_df/tbl/data.frame object to function
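One likely reading of this error, offered as a sketch rather than a definitive diagnosis: map_df() expects a function as its second argument, but grouped_mean(data = df, summary_var = target) is evaluated first, so map_df() receives a data frame instead. The example below loops over the factor column names and passes a function to map(). The helper group_mean_col() and the variables factor_cols, mean_cols and result are hypothetical names introduced for illustration, and stringsAsFactors = TRUE is an assumption so that A, B and C are factors on R 4.0 or later.

```r
library(purrr)

df <- data.frame(
  target = c(1, 4, 8, 9, 2, 1, 3, 5, 7, 1),
  A = c("A", "Z", "N", "A", "Z"),
  B = c("B", "Q", "G", "B", "T"),
  C = c("C", "Y", "C", "P", "Y"),
  stringsAsFactors = TRUE  # assumption: make A, B, C factors on R >= 4.0
)

# Hypothetical helper: group means of one column, aligned to the rows of `data`
group_mean_col <- function(data, group_col, summary_col) {
  ave(data[[summary_col]], data[[group_col]], FUN = mean)
}

# Names of all factor columns
factor_cols <- names(df)[map_lgl(df, is.factor)]

# map() gets a function (the formula lambda) and calls it once per column name;
# the names set here become the new column names
mean_cols <- map(
  set_names(factor_cols, paste0("Mean_", factor_cols)),
  ~ group_mean_col(df, .x, "target")
)

# Attach the new columns to the original data
result <- cbind(df, as.data.frame(mean_cols))
```

Passing column names as strings sidesteps the quoting and renaming problem inside mutate(), at the cost of leaving the tidy-eval style of the original function.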
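Questions and comments:
The following takes target as the column you want the mean of. It uses only mutate_if(), subsetting the result of tapply() by each factor to get the group means row by row. It then uses rename_at() to change the names to match the desired output. If you want lowercase names, you can wrap the gsub() call in tolower().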
df %>%
  mutate_if(is.factor, list(Mean = ~tapply(df$target, ., mean)[.])) %>%
  rename_at(vars(ends_with("Mean")), ~gsub("(.*?)_(.*)", "\\2_\\1", .))
   target A B C Mean_A Mean_B Mean_C
1       1 A B C    4.5    4.5   3.75
2       4 Z Q Y    2.5    3.5   2.50
3       8 N G C    6.5    6.5   3.75
4       9 A B P    4.5    4.5   8.00
5       2 Z T Y    2.5    1.5   2.50
6       1 A B C    4.5    4.5   3.75
7       3 Z Q Y    2.5    3.5   2.50
8       5 N G C    6.5    6.5   3.75
9       7 A B P    4.5    4.5   8.00
10      1 Z T Y    2.5    1.5   2.50
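In dplyr 1.0.0 and later, mutate_if() and rename_at() are superseded by across(), so the same result can probably be reached in one step. This is a sketch under a few assumptions, not the answer's own method: it swaps tapply() for base R's ave(), which returns group means already aligned to the rows, it assumes dplyr >= 1.0.0, and the variable name result is introduced here for illustration.

```r
library(dplyr)

df <- data.frame(
  target = c(1, 4, 8, 9, 2, 1, 3, 5, 7, 1),
  A = c("A", "Z", "N", "A", "Z"),
  B = c("B", "Q", "G", "B", "T"),
  C = c("C", "Y", "C", "P", "Y"),
  stringsAsFactors = TRUE  # assumption: make A, B, C factors on R >= 4.0
)

result <- df %>%
  mutate(
    across(
      where(is.factor),          # every factor column
      ~ ave(target, .x),         # mean of target within each level (ave() defaults to mean)
      .names = "Mean_{.col}"     # name the new columns Mean_A, Mean_B, Mean_C
    )
  )
```

The .names argument builds the column names directly, so no separate renaming step with gsub() is needed.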