I am trying to convert the following SAS code to R so that I get the same results as I get from SAS. Here is the SAS code:
DATA plants;
INPUT sample $ treatmt $ y ;
cards;
1 trt1 6.426264755
1 trt1 6.95419631
1 trt1 6.64385619
1 trt2 7.348728154
1 trt2 6.247927513
1 trt2 6.491853096
2 trt1 2.807354922
2 trt1 2.584962501
2 trt1 3.584962501
2 trt2 3.906890596
2 trt2 3
2 trt2 3.459431619
3 trt1 2
3 trt1 4.321928095
3 trt1 3.459431619
3 trt2 3.807354922
3 trt2 3
3 trt2 2.807354922
4 trt1 0
4 trt1 0
4 trt1 0
4 trt2 0
4 trt2 0
4 trt2 0
;
RUN;
PROC MIXED ASYCOV NOBOUND DATA=plants ALPHA=0.05 method=ML;
CLASS sample treatmt;
MODEL y = treatmt ;
RANDOM int treatmt/ subject=sample ;
RUN;
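I get the following covariance parameter estimates from SAS:
Intercept sample ==> 5.5795
treatmt sample ==> -0.08455
Residual ==> 0.3181
I tried the following in R, but I get different results.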
s=as.factor(sample)
lmer(y~ 1+treatmt+(1|treatmt:s),REML=FALSE)
Any help is appreciated.
Thanks,
格拉姆
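I don't know whether you can get exactly the same results in R as in SAS, but I got close by handling the contrasts as outlined here:
lmer for SAS PROC MIXED Users, page 6:
"Comparing estimates produced by SAS PROC MIXED and by lmer one must be careful to consider the contrasts that are used to define the effects of factors. In SAS a model with an intercept and a qualitative factor is defined in terms of the intercept and the indicator variables for all but the last level of the factor. The default behaviour in S is to use the Helmert contrasts for the factor. On a balanced factor these provide a set of orthogonal contrasts. In R the default is the "treatment" contrasts, which are almost the same as the SAS parameterization except that they drop the indicator of the first level, not the last level. When in doubt, check which contrasts are being used with the contrasts function. To make comparisons easier, you may find it worthwhile to declare
options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
at the beginning of your session."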
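As a small illustration of the quoted passage, the following base-R sketch compares the two codings for a two-level factor like `treatmt`. The factor `f` and the names `X_default` and `X_sas` are made up for this example; only `contr.treatment`, `contr.SAS` and `model.matrix` come from R itself.

```r
# Two-level factor with the same level names as in the question
lev <- c("trt1", "trt2")

# R default: the indicator of the FIRST level is dropped (trt1 is the reference)
contr.treatment(lev)

# SAS-style: the indicator of the LAST level is dropped (trt2 is the reference)
contr.SAS(lev)

# Effect on the fixed-effects design matrix (f is a hypothetical toy factor)
f <- factor(rep(lev, each = 2))
X_default <- model.matrix(~ f, contrasts.arg = list(f = "contr.treatment"))
X_sas     <- model.matrix(~ f, contrasts.arg = list(f = "contr.SAS"))

colnames(X_default)  # coefficient is reported for trt2
colnames(X_sas)      # coefficient is reported for trt1
```

With the SAS-style coding the fixed-effect coefficient is labelled with `trt1`, which is consistent with the `treatmttrt1` label in the lmer output further down.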
dput:
df <- structure(list(sample = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L),
treatmt = c("trt1", "trt1", "trt1", "trt2", "trt2", "trt2",
"trt1", "trt1", "trt1", "trt2", "trt2", "trt2", "trt1", "trt1",
"trt1", "trt2", "trt2", "trt2", "trt1", "trt1", "trt1", "trt2",
"trt2", "trt2"), y = c(6.426264755, 6.95419631, 6.64385619,
7.348728154, 6.247927513, 6.491853096, 2.807354922, 2.584962501,
3.584962501, 3.906890596, 3, 3.459431619, 2, 4.321928095,
3.459431619, 3.807354922, 3, 2.807354922, 0, 0, 0, 0, 0,
0)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-24L), .Names = c("sample", "treatmt", "y"))
Current code:
options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
df$sample=as.factor(df$sample)
lmer(y~ 1+treatmt+(1|treatmt:sample),REML=FALSE, data = df)
Current output:
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: y ~ 1 + treatmt + (1 | treatmt:sample)
   Data: df
     AIC      BIC   logLik deviance df.resid
 80.3564  85.0686 -36.1782  72.3564       20
Random effects:
 Groups         Name        Std.Dev.
 treatmt:sample (Intercept) 2.344
 Residual                   0.564
Number of obs: 24, groups:  treatmt:sample, 8
Fixed Effects:
(Intercept)  treatmttrt1
     3.3391      -0.1072
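You are using the SAS option NOBOUND, which allows negative variance estimates, and you do get a negative estimate. That is not possible with lmer, which constrains the variances to be non-negative.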
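To see why a negative component is not automatically nonsensical, here is a rough base-R sketch. It rests on an assumption not stated in the thread: that `RANDOM int treatmt / subject=sample` uses the PROC MIXED default variance-components structure, so two observations of the same sample share the "Intercept" component, plus the "treatmt" component when they also share the treatment. The names `s_int`, `s_trt`, `s_res` and `same_trt` are illustrative.

```r
# SAS estimates quoted in the question
s_int <- 5.5795    # "Intercept sample"
s_trt <- -0.08455  # "treatmt sample" (negative, allowed by NOBOUND)
s_res <- 0.3181    # "Residual"

# One sample: 3 observations under trt1, then 3 under trt2
trt <- rep(c("trt1", "trt2"), each = 3)
same_trt <- outer(trt, trt, "==") * 1   # 1 where two observations share a treatment

# Assumed marginal covariance of the 6 observations of one sample
V <- s_int * matrix(1, 6, 6) + s_trt * same_trt + s_res * diag(6)

# The model is usable as long as V is positive definite
ev <- eigen(V, symmetric = TRUE)$values
min(ev) > 0
```

Under that assumption the individual component is negative, but the covariance matrix of the data is still positive definite, which is all the likelihood needs.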
We can try to get the SAS results by hand. First, note that the equivalent lmer syntax is:
lmer(y ~ 1 + treatment + (1+treatment|sample), REML=FALSE, data = dat)
Let's maximize the log-likelihood ourselves, allowing negative variances:
dattxt <- "1 trt1 6.426264755
1 trt1 6.95419631
1 trt1 6.64385619
1 trt2 7.348728154
1 trt2 6.247927513
1 trt2 6.491853096
2 trt1 2.807354922
2 trt1 2.584962501
2 trt1 3.584962501
2 trt2 3.906890596
2 trt2 3
2 trt2 3.459431619
3 trt1 2
3 trt1 4.321928095
3 trt1 3.459431619
3 trt2 3.807354922
3 trt2 3
3 trt2 2.807354922
4 trt1 0
4 trt1 0
4 trt1 0
4 trt2 0
4 trt2 0
4 trt2 0
"
dat <- read.table(text = dattxt)
names(dat) <- c("sample", "treatment", "y")
dat$sample <- as.factor(dat$sample)
opts <- options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
library(lme4)
fit <- lmer(y ~ 1 + treatment + (1+treatment|sample), REML=FALSE, data = dat)
# marginal covariance matrix V = Z G Z' + R as a function of the variance components
Vfun <- function(fit, vcs){
  Z <- getME(fit, "Z")      # random-effects design matrix
  n <- getME(fit, "n")      # number of observations
  l_i <- getME(fit, "l_i")  # number of levels of the grouping factor (samples)
  sigma2_a <- vcs[1]        # random-intercept variance
  sigma2_b <- vcs[2]        # random-slope (treatment) variance
  sigma_ab <- vcs[3]        # intercept-slope covariance
  sigma2 <- vcs[4]          # residual variance
  G <- matrix(c(sigma2_a, sigma_ab, sigma_ab, sigma2_b), nrow = 2)
  R <- Diagonal(n, sigma2)
  Z %*% bdiag(rep(list(G),l_i)) %*% t(Z) + R
}
# negative log-likelihood (to be minimized)
library(mvtnorm)
logLHD <- function(params, fit){
  X <- getME(fit, "X")
  beta <- params[1:ncol(X)]                       # fixed effects come first
  y <- getME(fit, "y")
  vcs <- tail(params, length(params)-ncol(X))     # remaining entries: variance components
  V <- as.matrix(Vfun(fit, vcs))
  if(any(eigen(V)$values <= 0)){
    return(runif(1, 1e7, 1e8)) # return a large penalty if V is not positive definite
  }
  -dmvnorm(y, c(X%*%beta), sigma = V, log = TRUE)
}
# minimize the negative log-likelihood; only the residual variance is bounded below by 0
library(dfoptim)
vc <- VarCorr(fit) # lmer variance components, used as starting values
start <-
c(fixef(fit), vc$sample[1,1], vc$sample[2,2], vc$sample[1,2], sigma(fit)^2)
names(start)[3:6] <-
c("sample.Intercept", "sample.trt1", "covariance", "sigma2")
opt <- hjkb(start, logLHD, lower=c(-Inf,-Inf,-Inf,-Inf,-Inf,0), fit=fit)
### results
opt$par
# (Intercept) treatmenttrt1 sample.Intercept sample.trt1 covariance sigma2
# 3.33912840 -0.10721533 5.50671885 -0.16909628 0.07275635 0.31812378
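The residual variance is the same as the one obtained with SAS. To get the other SAS results we have to do some gymnastics on our results. I don't know why, but this is how we get them: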
### SAS results
opt$par[["sample.Intercept"]] + opt$par[["covariance"]]
# 5.579475
opt$par[["sample.trt1"]] / 2
# -0.08454814
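One possible reading of these two conversions, offered as an assumption rather than something established in this thread: with `treatmt` in the CLASS statement, SAS fits one variance for the sample intercept and one shared variance for a sample-by-treatment effect, whereas `(1+treatment|sample)` fits an unstructured 2x2 matrix. Under that reading, the SAS "Intercept" component would be the covariance between two observations of the same sample under different treatments, and the "treatmt" component the extra covariance when the treatments coincide, averaged over the two treatments. The sketch below only redoes that arithmetic with the values printed in `opt$par`; the variable names are illustrative.

```r
# Values copied from the opt$par output above (not recomputed here)
a    <- 5.50671885    # sample.Intercept
b    <- -0.16909628   # sample.trt1
c_ab <- 0.07275635    # covariance

# Implied covariances between two observations of the same sample,
# with trt2 as the reference level (contr.SAS coding)
cov_trt2_trt2 <- a                  # both observations under trt2
cov_trt1_trt1 <- a + 2 * c_ab + b   # both observations under trt1
cov_cross     <- a + c_ab           # one observation under each treatment

# Candidate SAS-style components under the assumed structure
sas_intercept <- cov_cross
sas_treatmt   <- mean(c(cov_trt1_trt1, cov_trt2_trt2)) - cov_cross

sas_intercept
sas_treatmt
```

Algebraically these reduce to `a + c_ab` and `b / 2`, the same two expressions used above, so this is at best a plausible interpretation of the arithmetic, not a proof that the two fits are the same model.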
Note that the log-likelihood is indeed higher when negative variances are allowed:
### remark: lmer achieves a lower log-likelihood
logLik(fit)
# 'log Lik.' -27.88947 (df=6)
-opt$value
# -26.43355
If someone could explain why these gymnastics are needed, I would be grateful...