R regression drops a level of a multi-level categorical variable


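I am running a logistic regression, and one of my independent variables is categorical (year: 2010, 2012, 2016). I set 2010 as the reference level. However, when I run the regression, the output drops the 2012 level... Has anyone seen this before, or does anyone know why it would drop one of the level comparisons? Thanks!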

Updated to show the independent variables and the full output

Variable setup:

year <- factor(year)
year <- relevel(year, ref="2010")

gender[gender==2] <- 0 # female
gender <- factor(gender)

> table(data$Year, data$A1_Gender)

         1   2
  2010 104  13
  2012  99  11
  2016 115  16

Output:

> log <- glm(d2~year + gender, family="binomial"(link="logit"))
> summary(log)

Call:
glm(formula = d2 ~ year + gender, family = binomial(link = "logit"))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.9989  -0.9441  -0.9339   1.4302   1.4422  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.43561    0.52008  -0.838    0.402
year2016    -0.02682    0.31424  -0.085    0.932
gender1     -0.14151    0.51136  -0.277    0.782

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 232.52  on 177  degrees of freedom
Residual deviance: 232.44  on 175  degrees of freedom
  (181 observations deleted due to missingness)
AIC: 238.44

Number of Fisher Scoring iterations: 4

Results from user20650's suggestion

> md = na.omit(ca[c("d2", "year", "gender")])
> table(md$year)

2016 2010 2012 
  98    0   80 
r logistic-regression categorical-data
1 Answer

user20650 was right: I had a missing-data problem. Thanks, everyone, for the help and suggestions!
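For anyone who lands here with the same symptom, a likely mechanism is the following. glm() first removes rows with missing values, then drops factor levels that have no rows left. If every 2010 row is incomplete, 2010 vanishes from the model and the next level, 2012, quietly becomes the baseline, so only year2016 appears in the coefficient table. Below is a minimal sketch on simulated data. The data frame `sim` and the assumption that `d2` is entirely missing for 2010 are hypothetical and only illustrate the effect; they are not the asker's actual data.

```r
# Simulated data (hypothetical; not the data set from the question)
set.seed(1)
n <- 300
sim <- data.frame(
  year   = factor(rep(c("2010", "2012", "2016"), each = n / 3)),
  gender = factor(sample(0:1, n, replace = TRUE)),
  d2     = rbinom(n, 1, 0.4)
)
sim$year <- relevel(sim$year, ref = "2010")

# Assumption for illustration: the outcome was never recorded in 2010
sim$d2[sim$year == "2010"] <- NA

# Count complete cases per level; na.omit() keeps the empty level,
# so 2010 shows up with a count of 0
complete <- na.omit(sim[c("d2", "year", "gender")])
table(complete$year)

# glm() drops the NA rows and then the now-empty 2010 level,
# so 2012 becomes the baseline and only year2016 is estimated
fit <- glm(d2 ~ year + gender, family = binomial(link = "logit"), data = sim)
names(coef(fit))   # "(Intercept)" "year2016" "gender1"
fit$xlevels$year   # levels actually used in the fit: "2012" "2016"
```

Checking `table()` on the complete cases before fitting, as user20650 suggested, shows right away which level has no usable rows. `fit$xlevels` confirms which levels the model really used.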
