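My teacher gave us a regression output, and the exercise is to reverse-engineer a dataset that produces it. We then have to run the regression on that dataset and get exactly the same results.
I managed to get the coefficient to -19.93, but I can't get the same SE at all, and I don't know how to proceed. I'm not sure whether I should use a formula linking the SE of the estimator to the standard error of the regression (I have some formulas, but I really don't know how to implement them in R)... Thanks in advance for your help!
My R code: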
## Given values
n <- 1592
se_β1 <- 1.47
β1hat <- -19.93
## Create a dummy variable
Low_anchor <- c(rep(0, Nc/2), rep(1, Nc/2))
## Formula of standard error of beta 1 (assuming homoskedasticity)
calculate_standard_error <- function(u, Low_anchor) {
sqrt((1/(n - 2))*sum(u^2)/(n*sd(Low_anchor)^2))
}
## Define initial values of u
u <- rnorm(n)
## Tolerance for convergence
tolerance <- 0.1
## Iteratively adjust u until the standard error matches the target
while (abs(calculate_standard_error(u, Low_anchor) - se_β1) > tolerance) {
## Generate new set of values for u from a normal distribution
u <- rnorm(n)
}
print(u)
## regression
Yc <- -19.93*Low_anchor + u
model1 <- lm(Yc ~ Low_anchor - 1)
## Print the summary of the model
summary(model1)
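It looks like Nc is not defined. I think you are using it to build the independent variable; here I will use x instead. Note that the problem seems to require using properties of the standard error and of the coefficient in a (multiple) regression of the form y = bx + u. You need to know those properties.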
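For reference, the code in this answer leans on two standard OLS facts: the standard error of a coefficient is the square root of the matching diagonal element of sigma^2 (X'X)^-1, and multiplying x by a constant k divides both the slope and its standard error by k. Here is a quick numerical check on made-up data. It is only a sketch: the names n_demo, x_demo, y_demo and k, and all the constants, are illustrative and do not come from the question.

```r
set.seed(42)
n_demo <- 200                               # illustrative sample size
x_demo <- rnorm(n_demo)
y_demo <- 3 - 2 * x_demo + rnorm(n_demo)    # arbitrary "true" model

fit <- lm(y_demo ~ x_demo)
coefs <- summary(fit)$coefficients

# Fact 1: SE = sqrt(sigma2_hat * diag((X'X)^-1)),
# where sigma2_hat = RSS / (n - number of coefficients)
X <- cbind(1, x_demo)
sigma2_hat <- sum(resid(fit)^2) / (n_demo - ncol(X))
se_manual <- sqrt(sigma2_hat * diag(solve(crossprod(X))))
all.equal(unname(se_manual), unname(coefs[, "Std. Error"]))

# Fact 2: multiplying x by k divides the slope and its SE by k
k <- 4
x_scaled <- k * x_demo
coefs_scaled <- summary(lm(y_demo ~ x_scaled))$coefficients
all.equal(coefs_scaled[2, 1:2] * k, coefs[2, 1:2])
```

Both comparisons should agree up to floating-point error. The first fact is what lets you pick the scale of u, and the second is what lets you rescale x to move the slope.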
With this, you can write a simple loop that adjusts u first and then x. First, we define some preliminaries:
n <- 1592
se_b1 <- 1.47
b1hat <- -19.93
set.seed(2)
x <- rnorm(n)
y_mean <- -19.93*x
# we are going to create a random variable to be the residuals
u <- rnorm(n,0,1)
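Note that these distributions are irrelevant to the answer. You can check that u, as drawn, is far from what we need. You can also check what happens after running:
y <- y_mean + u
summary(lm(y ~ x))
The coefficient and the standard error will be different from what you want. How do we fix that? By using the two properties mentioned above.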
error <- 1
tol <- 0.01
while (error > tol) {
  # Recall that se_b1 is the square root of the corresponding diagonal
  # element of sigma^2 * (X'X)^-1.
  # Build the design matrix X (assuming the model has an intercept)
  X <- matrix(c(rep(1, n), x), ncol = 2)
  XX_minus_one <- solve(t(X) %*% X)
  # With the current u, the slope's standard error would be
  present_se <- sqrt(var(u) * XX_minus_one[2, 2])
  # This differs from the target, so rescale the residuals to give the desired standard error
  u_fitting <- u * se_b1 / present_se
  y <- y_mean + u
  reg <- lm(y ~ x)
  estimated_b1 <- reg$coefficients[2]
  estimated_se_b1 <- summary(reg)$coefficients[2, 2]
  error <- max(abs(estimated_se_b1 - se_b1), abs(estimated_b1 - b1hat))
  # Now rescale x so that the slope moves toward b1hat
  x <- x * estimated_b1 / b1hat
  # The rescaled u is used in the next iteration
  u <- u_fitting
}
You can check that it worked:
summary(lm(y ~ x))
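As an aside, the same targets can probably be hit without iterating, by making the noise exactly orthogonal to the regressors and then rescaling it. The following is a sketch of that alternative, not the method used above. It assumes a 0/1 dummy split in half as in the question, a model with an intercept, and an intercept value of 0 (an arbitrary choice); the names b1_target, se_target, e and xtx_inv_22 are illustrative.

```r
set.seed(1)
n <- 1592
b1_target <- -19.93
se_target <- 1.47

# Binary regressor: half zeros, half ones (assumed from the question)
x <- rep(c(0, 1), each = n / 2)
X <- cbind(1, x)

# Draw noise and keep only the part orthogonal to the intercept and x,
# so adding it to the fitted line cannot move the OLS coefficients
e <- resid(lm(rnorm(n) ~ x))

# se(b1)^2 = sigma2_hat * [(X'X)^-1]_22, with sigma2_hat = sum(e^2) / (n - 2);
# solve for the residual variance that yields se_target, then rescale e
xtx_inv_22 <- solve(crossprod(X))[2, 2]
sigma2_target <- se_target^2 / xtx_inv_22
e <- e * sqrt(sigma2_target * (n - 2) / sum(e^2))

# Intercept of 0 is an arbitrary choice
y <- 0 + b1_target * x + e
fit <- summary(lm(y ~ x))$coefficients
fit["x", c("Estimate", "Std. Error")]
```

Because e is orthogonal to both columns of X, the fitted residuals are e itself. The slope and its standard error should therefore match the targets up to floating-point error, with no tolerance to tune.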