My problem is explained in the attached picture (link).
I have tried the following code without success:
df[paste0("combined_", df_of_column_names)] <- lapply(df, ave, na.rm =TRUE, df[["index Z"]])
This does not return the mean for groups that contain NAs.
df[paste0("combined_", df_of_column_names)] <- lapply(df, ave(FUN=function(x) mean(x, na.rm=T)), df[["index Z"]])
This gives the error:
Error in FUN(x) : argument "x" is missing, with no default
Can someone help me? Thanks a lot!
Using base R, I can get this to work for a simple case similar to yours.
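attach(warpbreaks)
breaks[5] <- NA  # introduce a missing value in the numeric column
df <- data.frame(wool = wool, breaks = breaks)
df <- cbind(df, breaks.1 = df$breaks)  # duplicate the column to have two value columns
df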
  wool breaks breaks.1
1    A     26       26
2    A     30       30
3    A     54       54
4    A     25       25
5    A     NA       NA
6    A     52       52
...
lapply(df[,-1], function(x) ave(x, df[,1], FUN = function(x) mean(x, na.rm=TRUE)))
$breaks
[1] 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846
[9] 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846
[17] 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846
[25] 29.53846 29.53846 29.53846 25.25926 25.25926 25.25926 25.25926 25.25926
[33] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
[41] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
[49] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
$breaks.1
[1] 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846
[9] 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846
[17] 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846 29.53846
[25] 29.53846 29.53846 29.53846 25.25926 25.25926 25.25926 25.25926 25.25926
[33] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
[41] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
[49] 25.25926 25.25926 25.25926 25.25926 25.25926 25.25926
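To write the group means back into the data frame under new names, as the question attempts, the result of the same lapply() call can be assigned to new columns. Below is a minimal sketch, assuming the grouping column is called `index Z` as in the question; the toy data and the column names return1 and return2 are made up for illustration.

```r
# Toy data (hypothetical): one grouping column and two numeric columns with NAs
df <- data.frame(
  `index Z` = c("a", "a", "b", "b"),
  return1   = c(1, NA, 3, 5),
  return2   = c(NA, 2, 4, NA),
  check.names = FALSE  # keep the space in "index Z"
)

# Every column except the grouping column gets a group mean
value_cols <- setdiff(names(df), "index Z")

# ave() returns a vector as long as its input, so each row keeps its group's mean
df[paste0("combined_", value_cols)] <- lapply(
  df[value_cols],
  function(x) ave(x, df[["index Z"]], FUN = function(v) mean(v, na.rm = TRUE))
)

df
```

The second attempt in the question fails because ave(FUN = ...) is called immediately, with no x, instead of being passed to lapply() as a function; wrapping the call in function(x) avoids that.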
Without a reproducible example it is hard to give a relevant answer, but try:
library(dplyr)
df2 <- df %>% # df is your data frame
group_by(`index Z`) %>%
summarise_all(.funs = mean, na.rm = TRUE)
# expected output: one row per original row, carrying its group's means
left_join(df["index Z"], df2, by = "index Z")
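If the goal is to keep every original row and add the group means as extra columns, a grouped mutate() may avoid the separate join. This is only a sketch, assuming dplyr 1.0 or later (for across()); the toy data and the names return1 and return2 are hypothetical.

```r
library(dplyr)

# Toy data (hypothetical): one grouping column and two numeric columns with NAs
df <- tibble(
  `index Z` = c("a", "a", "b", "b"),
  return1   = c(1, NA, 3, 5),
  return2   = c(NA, 2, 4, NA)
)

df_combined <- df %>%
  group_by(`index Z`) %>%
  # everything() skips the grouping column inside a grouped mutate()
  mutate(across(everything(),
                ~ mean(.x, na.rm = TRUE),
                .names = "combined_{.col}")) %>%
  ungroup()

df_combined
```

Here mutate() keeps the row count unchanged, whereas summarise_all() collapses each group to a single row.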
A similar answer to ANG's, but using data.table:
library(data.table)
df <- setDT(df)
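# group means per `index Z`, ignoring NAs
df2 <- df[, lapply(.SD, mean, na.rm = TRUE), by = `index Z`]
# join the means back onto every original row
df2[df, on = "index Z"]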
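With data.table, the group means can also be added by reference with := so that no join is needed. A minimal sketch under the same assumptions as the question (grouping column `index Z`); the toy data and the names return1 and return2 are made up for illustration.

```r
library(data.table)

# Toy data (hypothetical): one grouping column and two numeric columns with NAs
dt <- data.table(
  `index Z` = c("a", "a", "b", "b"),
  return1   = c(1, NA, 3, 5),
  return2   = c(NA, 2, 4, NA)
)

value_cols <- setdiff(names(dt), "index Z")
new_cols   <- paste0("combined_", value_cols)

# := adds the columns in place; each group's mean is recycled across its rows
dt[, (new_cols) := lapply(.SD, mean, na.rm = TRUE),
   by = `index Z`, .SDcols = value_cols]

dt
```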
Use the dplyr library. Check this example:
df1 %>% group_by(index) %>%
summarise(modreturn1 = mean(return1, na.rm = TRUE), modreturn2 = mean(return2, na.rm = TRUE))
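It returns a table that summarises the two variables by their means (excluding NAs). Now, if you really want as many rows as your original dataset: first save the result of the query above in a variable named resumen, then:
merge(df1["index"], resumen, by = "index", all.x = TRUE)
You're welcome :)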