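I am currently working with the fracture data in the OAI dataset. We have 32 columns of categorical variables, each recording fractures at a different body site. Every value is either "0", "1", or a blank "". I am trying to run a logistic regression between one column and the other variables. Code: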
new.glm <- function(mydata) {
newgroup <- as.factor( mydata$V00HIPFX.x)
inputdata <- mydata[,39:1230]
tresult <- apply(inputdata, 2, function(x,g) summary(glm(as.numeric(x)~g, family = "binomial", mydata))$coef[,"Pr(>|t|)"], g=as.factor(newgroup))
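As soon as the glm part of the function runs, I get the error: "contrasts can be applied only to factors with 2 or more levels"
I tried changing the glm call to use as.factor instead of as.numeric, but that gave me the error "Error in eval(family$initialize) : y values must be 0 <= y <= 1". I've also tried changing g=as.factor(newgroup) to g=as.factor(inputdat), but that didn't work either.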
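I'll simplify things as much as possible to reproduce exactly the problem you describe.
Your dependent variable is probably a character vector, like this: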
# create a copy of mtcars and break the am column intentionally
mtcars2 <- mtcars
mtcars2$am <- as.character(mtcars2$am)
# replace real values by "" at rows 1,5,10 (no particular reason for these numbers)
mtcars2$am[c(1,5,10)] <- ""
# fit a logit with mpg and wt as explanatory variables for am
glm(am ~ mpg + wt, data = mtcars2, family = binomial())
The error will be
Error in eval(family$initialize) : y values must be 0 <= y <= 1
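To check whether a column really is stored as text with blank entries, it can help to inspect it before fitting anything. The following is a minimal sketch that rebuilds the broken mtcars2 example from above so it runs on its own; the name am_counts is introduced here for illustration only.

```r
# Rebuild the intentionally broken example so this block is self-contained
mtcars2 <- mtcars
mtcars2$am <- as.character(mtcars2$am)
mtcars2$am[c(1, 5, 10)] <- ""

# The response is stored as text, not as numbers or as a factor
class(mtcars2$am)

# Tabulate the distinct values; the blank string shows up as its own category
am_counts <- table(mtcars2$am, useNA = "ifany")
am_counts
```

If the table shows a "" category next to "0" and "1", the blanks are being treated as a value rather than as missing data.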
Assuming mtcars2 is the starting point (i.e. the data arrive in the "wrong" format), we can fix it:
# convert to factor, but changing "" to NA
mtcars2$am <- as.factor(as.integer(mtcars2$am))
# try again
glm(am ~ mpg + wt, data = mtcars2, family = binomial())
which returns
Call: glm(formula = am ~ mpg + wt, family = binomial(), data = mtcars2)
Coefficients:
(Intercept)          mpg           wt
    23.2200      -0.2811      -5.8190
Degrees of Freedom: 28 Total (i.e. Null); 26 Residual
(3 observations deleted due to missingness)
Null Deviance: 39.34
Residual Deviance: 16.56 AIC: 22.56
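The same cleanup can be applied to many columns at once, which is closer to the situation in the question. What follows is only a sketch on simulated data: the data frame fx, its column names (hip, wrist, spine) and the helper to_binary are invented for illustration and are not taken from the OAI dataset. It also makes two assumptions worth stating. First, the "contrasts" error most likely appears when the grouping factor has only one level left once rows with missing values are dropped, so the sketch skips such columns. Second, a binomial glm labels its p-value column "Pr(>|z|)" rather than "Pr(>|t|)".

```r
# Hypothetical data: character columns holding "0", "1" or "" (names invented)
set.seed(1)
n <- 200
draw <- function() {
  sample(c("0", "1", ""), n, replace = TRUE, prob = c(0.6, 0.3, 0.1))
}
fx <- data.frame(hip = draw(), wrist = draw(), spine = draw(),
                 stringsAsFactors = FALSE)

# Turn "" into NA and the remaining text into integers 0/1
to_binary <- function(x) as.integer(ifelse(x == "", NA, x))
fx[] <- lapply(fx, to_binary)

# Fit one logistic regression per outcome column against the grouping column
# and keep the p-value of the group coefficient
group <- factor(fx$hip)
outcomes <- setdiff(names(fx), "hip")
pvals <- sapply(outcomes, function(col) {
  keep <- !is.na(fx[[col]]) & !is.na(group)
  y <- fx[[col]][keep]
  g <- droplevels(group[keep])
  # Skip columns where the outcome or the group has fewer than two values left
  if (length(unique(y)) < 2 || nlevels(g) < 2) return(NA_real_)
  fit <- glm(y ~ g, family = binomial())
  coef(summary(fit))[2, "Pr(>|z|)"]
})
pvals
```

Returning NA for columns that cannot be fitted keeps the result the same length as the list of outcome columns, so it is easy to see which ones were skipped.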