从glm（）中删除完全分开的观察结果

Question

我正在使用HMDA软件包中的AER数据进行一些探索性数据分析；但是，我用来拟合模型的变量似乎包含一些可以完美确定结果的观察结果，这一问题称为“分离”。因此，我尝试使用this thread推荐的解决方案对此进行补救，但是当我尝试从glm.fit()执行第一组源代码时，R返回了一条错误消息：

Error in family$family : object of type 'closure' is not subsettable

因此，我无法继续使用此代码从我的数据中删除那些完全确定的观察结果。我想知道是否有人可以帮助我解决此问题？

下面提供了我当前的代码供您参考。

# load the AER package and HMDA data
library(AER)
data(HMDA)

# fit a 2-degree olynomial probit model 
probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial, data = HMDA)

# using the revised source code from that stackexchage thread to find out observations that received a warning message
library(tidyverse)
library(dplyr)
library(broom)

eps <- 10 * .Machine$double.eps
if (family$family == "binomial") {
  if (any(mu > 1 - eps) || any(mu < eps)) 
    warning("glm.fit: fitted probabilities numerically 0 or 1 occurred", 
            call. = FALSE)
}

# this return the following error message
# Error in family$family : object of type 'closure' is not subsettable


probit.resids <- augment(probit.fit) %>%
  mutate(p = 1 / (1 + exp(-.fitted)),
         warning = p > 1-eps)

arrange(probit.resids, desc(.fitted)) %>%  
  select(2:5, p, warning) %>% 
  slice(1:10)


HMDA.nwarning <- filter(HMDA, !probit.resids$warning)

# using HMDA.nwarning should solve the problem...
probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial, data = HMDA.nwarning)

Answer 1

这段代码

if (family$family == "binomial") {
  if (any(mu > 1 - eps) || any(mu < eps)) 
    warning("glm.fit: fitted probabilities numerically 0 or 1 occurred", 
            call. = FALSE)
}

存在一个函数，当您使用family ==“ binomial”运行glm时会调用binomial()。如果查看glm（只需键入glm）：

if (is.character(family)) 
        family <- get(family, mode = "function", envir = parent.frame())
    if (is.function(family)) 
        family <- family()
    if (is.null(family$family)) {
        print(family)
        stop("'family' not recognized")
    }

并且glm函数在拟合过程中检查binomial（）$ family，并且如果任何预测值与eps的1或0都不相同，则会引发该警告。

您不需要运行该部分，是的，您需要设置eps <- 10 * .Machine$double.eps。因此，让我们运行下面的代码，如果要运行一个概率，则需要以二项式指定link="probit"，否则默认为logit：

library(AER)
library(tidyverse)
library(dplyr)
library(broom)

data(HMDA)

probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial(link="probit"), data = HMDA)

eps <- 10 * .Machine$double.eps

probit.resids <- augment(probit.fit) %>%
  mutate(p = 1 / (1 + exp(-.fitted)),
         warning = p > 1-eps)

列警告指示观察值是否发出警告，在此数据集中，有一个：

table(probit.resids$warning)

FALSE  TRUE 
 2379     1

我们可以使用下一步进行过滤

HMDA.nwarning <- filter(HMDA, !probit.resids$warning)
dim(HMDA.nwarning)
[1] 2379   14

并重新运行回归：

probit.fit <- glm(deny ~ poly(hirat, 2), family = binomial(link="probit"), data = HMDA.nwarning)
coefficients(probit.fit)
(Intercept) poly(hirat, 2)1 poly(hirat, 2)2 
      -1.191292        8.708494        6.884404

从glm（）中删除完全分开的观察结果

问题描述投票：1回答：1

1个回答

最新问题

从glm（）中删除完全分开的观察结果

问题描述 投票：1回答：1

1个回答

最新问题

问题描述投票：1回答：1