How to fix variable levels for logistic regression in R


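I am currently working with a fracture dataset from the OAI dataset. We have 32 columns of categorical variables that record fractures at different body sites. Each value is "0", "1", or an empty string "". I am trying to run a logistic regression between one column and the other variables. Code: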

new.glm <- function(mydata) {
    
  newgroup <- as.factor( mydata$V00HIPFX.x)
  inputdata <- mydata[,39:1230]
  
  tresult <- apply(inputdata, 2, function(x,g) summary(glm(as.numeric(x)~g, family = "binomial", mydata))$coef[,"Pr(>|t|)"], g=as.factor(newgroup))

As soon as the glm part of the function runs, I get the error: "contrasts can be applied only to factors with 2 or more levels"

I tried changing the glm call to use as.factor instead of as.numeric, but that gave me the error "Error in eval(family$initialize) : y values must be 0 <= y <= 1". I've also tried changing g=as.factor(newgroup) to g=as.factor(inputdat), but that also didn't work.

r logistic-regression
1 Answer

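I'll simplify as much as possible to show exactly the problem you describe.

Your dependent variable is probably a character vector, like this: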

# create a copy of mtcars and break the am column intentionally
mtcars2 <- mtcars
mtcars2$am <- as.character(mtcars2$am)
# replace real values by "" at rows 1,5,10 (no particular reason for these numbers)
mtcars2$am[c(1,5,10)] <- ""

# fit a logit with mpg and wt as explanatory variables for am
glm(am ~ mpg + wt, data = mtcars2, family = binomial())

The error will be:

Error in eval(family$initialize) : y values must be 0 <= y <= 1
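
To check whether this is what is happening in your own data, it can help to look at the column's class and its distinct values before fitting. A minimal sketch on the same mtcars-based example (the name `am_broken` is only for illustration, not part of your data):

```r
# rebuild the intentionally broken column from the example above
am_broken <- as.character(mtcars$am)
am_broken[c(1, 5, 10)] <- ""

# a character class here means glm() is not getting a 0/1 or factor response
class(am_broken)

# the blank strings show up as their own category next to "0" and "1"
table(am_broken, useNA = "ifany")
```

If the table shows a "" category, the blanks are being treated as data rather than as missing values.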

However, assuming mtcars2 is the starting point (i.e. the data arrives in the "wrong" format), we can fix it:

# convert to factor, but changing "" to NA
mtcars2$am <- as.factor(as.integer(mtcars2$am))

# try again
glm(am ~ mpg + wt, data = mtcars2, family = binomial())

which returns:

Call:  glm(formula = am ~ mpg + wt, family = binomial(), data = mtcars2)

Coefficients:
(Intercept)          mpg           wt  
    23.2200      -0.2811      -5.8190  

Degrees of Freedom: 28 Total (i.e. Null);  26 Residual
  (3 observations deleted due to missingness)
Null Deviance:      39.34 
Residual Deviance: 16.56    AIC: 22.56
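
The same cleaning can be applied column by column, as in your function. What follows is only a sketch under assumptions: `clean01`, `fit_one` and the `toy` data frame are hypothetical names made up for illustration, and I am guessing that the "contrasts" error comes from a column that has only one distinct value left once the blank rows are dropped. It also reads the p-value from the "Pr(>|z|)" column, because a binomial glm reports z statistics and has no "Pr(>|t|)" column.

```r
# hypothetical helper: turn "0"/"1"/"" into 0/1/NA
clean01 <- function(x) as.integer(as.character(x))

# hypothetical helper: p-value of the group effect, or NA if the model cannot be fit
fit_one <- function(y_raw, g_raw) {
  y <- clean01(y_raw)
  g <- clean01(g_raw)
  ok <- !is.na(y) & !is.na(g)
  y <- y[ok]
  g <- g[ok]
  # glm() needs two distinct values in both variables among the complete rows
  if (length(unique(y)) < 2 || length(unique(g)) < 2) return(NA_real_)
  fit <- glm(y ~ factor(g), family = binomial())
  coef(summary(fit))[2, "Pr(>|z|)"]
}

# made-up data standing in for the fracture columns
set.seed(1)
toy <- data.frame(
  hip   = sample(c("0", "1", ""), 200, replace = TRUE),
  wrist = sample(c("0", "1", ""), 200, replace = TRUE),
  spine = sample(c("0", ""), 200, replace = TRUE),  # only one real level
  stringsAsFactors = FALSE
)

# sapply() keeps each column as a vector; apply() would first coerce the
# whole data frame to a character matrix
pvals <- sapply(toy[c("wrist", "spine")], fit_one, g_raw = toy$hip)
pvals
```

Columns that cannot be modelled come back as NA instead of stopping the whole loop, so you can see afterwards which ones were skipped.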