R Sargan test results vs. the manual approach

Question

I have been trying to reproduce by hand the result that the Sargan test gives in R, so far without success.

When I run ivreg() and then print the Sargan test statistic:

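library(readstata13) # read.dta13()
library(AER)         # ivreg(); assumed here, the ivreg package provides it too

eitc <- read.dta13('education_earnings_v2.dta')
eitc$ln.wage <- log(eitc$wage)

# 2SLS: educ is instrumented by nearc4 and nearc2
TSLS <- ivreg(data = eitc, ln.wage ~ educ + exper + south + nonwhite
                           | nearc4 + nearc2 + exper + south + nonwhite)

summary(TSLS, diagnostics=TRUE)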

I get a Sargan statistic of 1.63. However, when I try to carry out the test by hand:

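# First stage: regress educ on all instruments
surp_IV1 <- lm(educ ~ nearc2 + nearc4 + exper + south + nonwhite, data=eitc)
surp_IV_fit <- surp_IV1$fitted.values
# Second stage: replace educ with its first-stage fitted values
surp_IV2 <- lm(ln.wage ~ surp_IV_fit + exper + south + nonwhite, data=eitc)

surp_resid <- resid(surp_IV2)

# Auxiliary regression of the second-stage residuals on all instruments
test_surplus <- lm(surp_resid ~ nearc2 + nearc4 + exper + south + nonwhite,
                   data = eitc)

summary(test_surplus)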

With 3,010 observations and an R-squared of 0.0008032, I get a test statistic of 2.42 (3010 × 0.0008032).

What causes the discrepancy?

Answer 1

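I think some of the steps are unnecessary. The auxiliary regression should use the residuals of the 2SLS fit itself, resid(TSLS), which are computed with the observed educ. The residuals of a hand-run second-stage lm() are computed with the fitted values of educ instead, so they are not the same series.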
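For reference, the textbook form of the statistic can be written as follows. The notation is not from the original post: \hat{u} stands for the 2SLS residuals, Z for the full instrument set (nearc4, nearc2 and the exogenous regressors), and n for the number of observations. The chi-squared approximation assumes valid instruments and homoskedastic errors.

```latex
\hat{u} = y - X\hat{\beta}_{\mathrm{2SLS}}, \qquad
J = n \, R^2_{\hat{u} \sim Z} \;\overset{a}{\sim}\; \chi^2_{q}, \qquad
q = (\text{excluded instruments}) - (\text{endogenous regressors}) = 2 - 1 = 1
```

The single degree of freedom matches the df1 value reported for the Sargan row in the diagnostics below.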

library(wooldridge) # data(card)
library(dplyr)      # rename()
library(ivreg)      # ivreg()

data(card)

eitc <- card |> 
  rename(nonwhite = black)

TSLS <- ivreg(lwage ~ educ + exper + south + nonwhite 
              | nearc4 + nearc2 + exper + south + nonwhite,
              data = eitc)

summary(TSLS, diagnostics = TRUE)
#> 
#> Call:
#> ivreg(formula = lwage ~ educ + exper + south + nonwhite | nearc4 + 
#>     nearc2 + exper + south + nonwhite, data = eitc)
#> 
#> Residuals:
#>       Min        1Q    Median        3Q       Max 
#> -2.309361 -0.319674  0.007403  0.334821  1.783133 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  2.17357    0.70494   3.083  0.00207 ** 
#> educ         0.24131    0.04089   5.901 4.02e-09 ***
#> exper        0.10453    0.01660   6.296 3.50e-10 ***
#> south       -0.08534    0.02647  -3.224  0.00128 ** 
#> nonwhite    -0.01541    0.04644  -0.332  0.74009    
#> 
#> Diagnostic tests:
#>                   df1  df2 statistic  p-value    
#> Weak instruments    2 3004    19.682 3.22e-09 ***
#> Wu-Hausman          1 3004    27.580 1.61e-07 ***
#> Sargan              1   NA     1.626    0.202    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.4998 on 3005 degrees of freedom
#> Multiple R-Squared: -0.2668, Adjusted R-squared: -0.2685 
#> Wald test:  87.8 on 4 and 3005 DF,  p-value: < 2.2e-16

TSLS_resid <- resid(TSLS)

surp_IV1 <- lm(TSLS_resid ~ nearc2 + nearc4 + exper + south + nonwhite,
               data = eitc)

nobs(surp_IV1) * summary(surp_IV1)[["r.squared"]] # number of observations * R-squared
#> [1] 1.625811

Created on 2023-12-15 with reprex v2.0.2
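To see why the two sets of residuals give different statistics, here is a minimal base R sketch on simulated data. Everything in it is an assumption made for illustration (the variables z1, z2, w, x, y and the helper n_r2() are hypothetical and unrelated to the Card data). It runs the two stages by hand, then builds the 2SLS residuals by applying the second-stage coefficients to the observed regressor.

```r
# Simulated data (hypothetical; not the data set from the post)
set.seed(1)
n  <- 2000
z1 <- rnorm(n)                 # excluded instrument 1
z2 <- rnorm(n)                 # excluded instrument 2
w  <- rnorm(n)                 # exogenous control
u  <- rnorm(n)                 # structural error
x  <- 0.5 * z1 + 0.5 * z2 + 0.3 * w + 0.8 * u + rnorm(n)  # endogenous regressor
y  <- 1 + 0.2 * x + 0.5 * w + u

# Two stages run by hand with lm()
stage1 <- lm(x ~ z1 + z2 + w)
x_hat  <- fitted(stage1)
stage2 <- lm(y ~ x_hat + w)

# Residuals as in the question: y minus a fit built from x_hat
resid_stage2 <- resid(stage2)

# 2SLS residuals: same coefficients, applied to the observed x
b <- coef(stage2)
resid_2sls <- y - (b[["(Intercept)"]] + b[["x_hat"]] * x + b[["w"]] * w)

# n * R-squared from the auxiliary regression on all instruments
n_r2 <- function(res) {
  aux <- lm(res ~ z1 + z2 + w)
  nobs(aux) * summary(aux)$r.squared
}

sargan_correct <- n_r2(resid_2sls)    # uses the 2SLS residuals
n_r2_stage2    <- n_r2(resid_stage2)  # uses the second-stage lm() residuals

print(c(sargan = sargan_correct, from_stage2_resid = n_r2_stage2))
```

The two residual series differ only by a multiple of the first-stage residuals, which are orthogonal to the instruments. The fitted part of the auxiliary regression is therefore the same in both cases, and the two n * R-squared values differ only through the total sum of squares in the denominator of R-squared.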
