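It seems that lm inside sapply won't take a formula passed as an argument.
lm on its own
On its own, lm accepts a formula passed as the argument FO; both of these calls work: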
summary(lm(y ~ x, df1, df1[["z"]] == 1, df1[["w"]]))$coef[1, ]
summary(lm(FO, data, data[[st]] == st1, data[[ws]]))$coef[1, ]
lm within sapply
The same two calls inside sapply:
sapply(unique(df1$z), function(s)
  summary(lm(y ~ x, df1, df1[["z"]] == s, df1[[ws]]))$coef[1, ])
sapply(unique(data[[st]]), function(s)
  summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ])
The second call, which passes FO, results in an error:
Error in eval(substitute(subset), data, env) : object 's' not found
Passing everything as an argument except the formula FO still works:
sapply(unique(data[[st]]), function(s)
  summary(lm(y ~ x, data, data[[st]] == s, data[[ws]]))$coef[1, ])
lm in a for loop
With all arguments passed as variables, including FO, it works in a for loop:
m <- matrix(NA, 4, length(unique(data[[st]])))
for (s in unique(data[[st]])) {
  m[, s] <- summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ]
}
m
# [,1] [,2] [,3]
# [1,] 1.6269038 -0.1404174 -0.010338774
# [2,] 0.9042738 0.4577001 1.858138516
# [3,] 1.7991275 -0.3067890 -0.005564049
# [4,] 0.3229600 0.8104951 0.996457853
Data:
df1 <- structure(list(x = c(1.37095844714667, -0.564698171396089, 0.363128411337339,
0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
-0.0946590384130976, 2.01842371387704), y = c(1.30824434809425,
0.740171482827397, 2.64977380403845, -0.755998096151299, 0.125479556323628,
-0.239445852485142, 2.14747239550901, -0.37891195982917, -0.638031707027734
), z = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), w = c(0.7, 0.8,
1.2, 0.9, 1.3, 1.2, 0.8, 1, 1)), class = "data.frame", row.names = c(NA,
-9L))
FO <- y ~ x; data <- df1; st <- "z"; ws <- "w"; st1 <- 1
sessionInfo():
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0 yaj_0.0.0.9044 packrat_0.5.0
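This worked when I tried it. It looks like using x as the function argument while x also appears in the formula interferes with how you want the function to behave. Replacing that argument with num produces what sounds like the result you are looking for. That way, the x in the formula is guaranteed to refer to the data set and not to the function argument.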
sapply(unique(dat$z), function(num) summary(lm(y ~ x, dat, z == num))$coef[1, ])
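The name clash described in this answer can be reproduced without lm. The sketch below uses subset(), which looks names up in the data frame first and only then in the calling function, the same order lm applies to its subset argument. The data frame d and the helpers n_rows_x and n_rows_num are made up for illustration.

```r
# Toy data (hypothetical); x and z mimic two columns of the question's df1.
d <- data.frame(x = c(0.3, -1.1, 0.8, 1.5, -0.4, 0.9, -0.7, 0.2, 1.9),
                z = rep(1:3, each = 3))

# Argument named like a column: `x` in the condition resolves to the
# column d$x, so the function argument is never used.
n_rows_x <- function(x) nrow(subset(d, z == x))

# Argument whose name is not a column: it resolves to the argument.
n_rows_num <- function(num) nrow(subset(d, z == num))

n_rows_x(1)    # 0 rows: compares column z with column x
n_rows_num(1)  # 3 rows: compares column z with the value 1
```

Choosing an argument name that cannot collide with a column name, as done with num above, avoids the silent mismatch.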
Thanks to a hint from @David on R-help to try do.call, I was able to figure it out. The solution is:
sapply(unique(data[[st]]), function(s)
  summary(do.call("lm", list(FO, data, data[[st]] == s,
                             data[[ws]])))$coef[1, ])
# [,1] [,2] [,3]
# Estimate 1.6269038 -0.1404174 -0.010338774
# Std. Error 0.9042738 0.4577001 1.858138516
# t value 1.7991275 -0.3067890 -0.005564049
# Pr(>|t|) 0.3229600 0.8104951 0.996457853
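As a rough sketch, the do.call approach can be wrapped in a small helper so that the formula, data, stratum column and weight column are all passed as arguments. The function name coef_by_stratum and the toy data frame d are assumptions made for this example, not part of the original code.

```r
# Toy data (hypothetical); column names mirror the question's df1.
d <- data.frame(
  x = c(0.3, -1.1, 0.8, 1.5, -0.4, 0.9, -0.7, 0.2, 1.9),
  y = c(1.2, 0.1, 2.0, -0.5, 0.4, 0.0, 1.7, -0.3, 0.6),
  z = rep(1:3, each = 3),
  w = c(0.7, 0.8, 1.2, 0.9, 1.3, 1.2, 0.8, 1.0, 1.0)
)

# Hypothetical helper: intercept row of the coefficient table per stratum.
coef_by_stratum <- function(fo, data, st, ws) {
  sapply(unique(data[[st]]), function(s) {
    # list() evaluates the subset and the weights right here, so lm
    # receives a logical vector and a numeric vector, not expressions
    # that it would have to look up later.
    fit <- do.call("lm", list(fo, data, data[[st]] == s, data[[ws]]))
    summary(fit)$coef[1, ]
  })
}

res <- coef_by_stratum(y ~ x, d, "z", "w")
res
```

One side effect worth knowing: because the arguments are already evaluated, the call stored in the fitted object contains the data itself rather than variable names, so printing a fit produced this way can be verbose.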
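Explanation (credits to @Duncan from R-help): lm looks up the expressions given for subset and weights first in data and then in the environment attached to the formula, which is the environment where the formula was created:
> environment(FO)
# <environment: R_GlobalEnv>
FO was created in the global environment, so s, which exists only inside the anonymous function called by sapply, cannot be found from there. This is probably the reason why it works with do.call and a list of arguments: the arguments are evaluated before lm is called.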
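The role of the formula's environment can be checked with a small experiment. This is only a sketch on made-up data (the data frame d and the names fo, res_fail and res_ok are assumptions): the first attempt reproduces the failure, and the second attaches the anonymous function's own environment to a local copy of the formula, which is an alternative to do.call rather than the fix used above.

```r
# Toy data (hypothetical); column names mirror the question's df1.
d <- data.frame(
  x = c(0.3, -1.1, 0.8, 1.5, -0.4, 0.9, -0.7, 0.2, 1.9),
  y = c(1.2, 0.1, 2.0, -0.5, 0.4, 0.0, 1.7, -0.3, 0.6),
  z = rep(1:3, each = 3),
  w = c(0.7, 0.8, 1.2, 0.9, 1.3, 1.2, 0.8, 1.0, 1.0)
)
fo <- y ~ x  # created outside the anonymous functions below

# Fails: `s` lives in the anonymous function's frame, which cannot be
# reached from the environment attached to `fo`. The error message is
# captured instead of stopping the script.
res_fail <- tryCatch(
  sapply(unique(d$z), function(s)
    summary(lm(fo, d, d[["z"]] == s, d[["w"]]))$coef[1, ]),
  error = function(e) conditionMessage(e)
)

# Works: make a local copy of the formula whose environment is the
# current function frame, so `s` is visible when lm evaluates subset.
res_ok <- sapply(unique(d$z), function(s) {
  environment(fo) <- environment()
  summary(lm(fo, d, d[["z"]] == s, d[["w"]]))$coef[1, ]
})

res_fail
res_ok
```

The assignment inside the function only modifies a local copy, so the formula defined outside keeps its original environment.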