Plotting a model's coefficients in R


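I am fitting my training data with glm() and want to plot the coefficients. However, I don't know how to produce a plot like the one in this image:

[image: the desired coefficient plot]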

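library(caret)  # createDataPartition(), train(), trainControl()
library(dplyr)  # %>% and mutate()

set.seed(1)
# Hold out 20% of the rows for testing
trn_index = createDataPartition(y = development$EQUAL_PAY, p = 0.80, list = FALSE)
trn_pay = development[trn_index, ]
tst_pay = development[-trn_index, ]

# Make "YES" the reference level of the response
trn_pay_f <- trn_pay %>%
  mutate(EQUAL_PAY = relevel(factor(EQUAL_PAY), ref = "YES"))

# Logistic regression with 10-fold cross-validation
pay_lgr = train(EQUAL_PAY ~ . - EQUAL_WORK - COUNTRY, method = "glm", family = binomial(link = "logit"), data = trn_pay_f, trControl = trainControl(method = 'cv', number = 10))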

summary(pay_lgr)
##Coefficients:
                             Estimate Std. Error z value Pr(>|z|)  
(Intercept)                -2.560e+00  2.552e+00  -1.003   0.3158  
GDP_PER_CAP                -5.253e-05  3.348e-05  -1.569   0.1167  
CO2_PER_CAP                 1.695e-01  7.882e-02   2.151   0.0315 *
PERC_ACCESS_ELECTRICITY    -7.833e-03  1.249e-02  -0.627   0.5304  
ATMS_PER_1E5               -2.473e-03  8.012e-03  -0.309   0.7576  
PERC_INTERNET_USERS        -2.451e-02  2.047e-02  -1.198   0.2310  
SCIENTIFIC_ARTICLES_PER_YR  2.698e-05  1.519e-05   1.776   0.0757 .
PERC_FEMALE_SECONDARY_EDU   1.126e-01  5.934e-02   1.897   0.0578 .
PERC_FEMALE_LABOR_FORCE    -6.559e-03  1.477e-02  -0.444   0.6569  
PERC_FEMALE_PARLIAMENT     -4.786e-02  2.191e-02  -2.184   0.0289 *

## extract all parameters into a data frame
pay_coef <- coef(summary(pay_lgr))
pay_lgrFrame <- data.frame(COEFFICIENT = rownames(pay_coef),
                           p_value = pay_coef[, 4],
                           z_value = pay_coef[, 3],
                           SE = pay_coef[, 2],
                           Estimate = pay_coef[, 1])

## This is where I got stuck: how do I turn this into a plot like the image linked above?
1 Answer

Read in the summary table (you could get it directly with ss <- as.data.frame(coef(summary(pay_lgr))), but I don't have your data set):

# The header has one fewer field than the data rows,
# so the first column is used as the row names.
ss <- read.csv(header=TRUE,check.names=FALSE,text="
Estimate,Std. Error,z value,Pr(>|z|)
(Intercept),-2.560e+00,2.552e+00,-1.003,0.3158
GDP_PER_CAP,-5.253e-05,3.348e-05,-1.569,0.1167
CO2_PER_CAP,1.695e-01,7.882e-02,2.151,0.0315
PERC_ACCESS_ELECTRICITY,-7.833e-03,1.249e-02,-0.627,0.5304
ATMS_PER_1E5,-2.473e-03,8.012e-03,-0.309,0.7576
PERC_INTERNET_USERS,-2.451e-02,2.047e-02,-1.198,0.2310
SCIENTIFIC_ARTICLES_PER_YR,2.698e-05,1.519e-05,1.776,0.0757
PERC_FEMALE_SECONDARY_EDU,1.126e-01,5.934e-02,1.897,0.0578
PERC_FEMALE_LABOR_FORCE,-6.559e-03,1.477e-02,-0.444,0.6569
PERC_FEMALE_PARLIAMENT,-4.786e-02,2.191e-02,-2.184,0.0289")
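
If you do have a fitted model at hand, the same table can be pulled straight from it. A minimal sketch, assuming the built-in infert data set as a stand-in for your data (the formula and the names fit and ss_direct are only illustrative). The as.data.frame() call is there because coef(summary(...)) returns a matrix, while the next step expects a data frame:

```r
# Stand-in logistic regression on a data set that ships with R
# (an assumption for illustration; not the asker's data).
fit <- glm(case ~ spontaneous + induced, data = infert, family = binomial())

# coef(summary(fit)) is a numeric matrix with one row per coefficient;
# convert it to a data frame, keeping the coefficient names as row names.
ss_direct <- as.data.frame(coef(summary(fit)))
ss_direct
```

With the caret object from the question, the equivalent would presumably be as.data.frame(coef(summary(pay_lgr))), since summary(pay_lgr) printed the same kind of table there.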

Convert the row names into a column named term:

ss2 <- tibble::rownames_to_column(ss,"term")
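
If you would rather not depend on tibble for this step, roughly the same result can be had in base R. A small sketch on a two-row toy table (ss_demo and ss2_demo are hypothetical names; the values are copied from the summary above):

```r
# Toy table with coefficient names stored as row names
ss_demo <- data.frame(Estimate = c(-2.560, 0.1695),
                      row.names = c("(Intercept)", "CO2_PER_CAP"))

# Prepend the row names as an ordinary column, then drop the row names
ss2_demo <- cbind(term = rownames(ss_demo), ss_demo)
rownames(ss2_demo) <- NULL
ss2_demo
```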

Draw a bar chart:

library(ggplot2)
ggplot(ss2, aes(term, Estimate)) +
  geom_col() +    # same as geom_bar(stat = "identity")
  coord_flip()    # put the term labels on the vertical axis
ggsave("bar.png")

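
If you also want to show the uncertainty, one option is to draw each estimate as a point with an approximate 95% Wald interval (estimate plus or minus about 1.96 standard errors) instead of a bar. The sketch below hard-codes three rows from the summary table above. The column names estimate, se, lower and upper are my own choice, and the intervals are on the log-odds scale of the logistic model:

```r
library(ggplot2)

# Three rows copied from the summary table (intercept left out)
coefs <- data.frame(
  term     = c("CO2_PER_CAP", "PERC_FEMALE_SECONDARY_EDU", "PERC_FEMALE_PARLIAMENT"),
  estimate = c(1.695e-01, 1.126e-01, -4.786e-02),
  se       = c(7.882e-02, 5.934e-02, 2.191e-02)
)

# Approximate 95% Wald interval: estimate +/- z * SE
z <- qnorm(0.975)
coefs$lower <- coefs$estimate - z * coefs$se
coefs$upper <- coefs$estimate + z * coefs$se

p <- ggplot(coefs, aes(term, estimate, ymin = lower, ymax = upper)) +
  geom_hline(yintercept = 0, linetype = "dashed") +  # reference line at zero
  geom_pointrange() +
  coord_flip()
p
```

Intervals that cross the dashed line correspond to coefficients that are not significant at the 5% level in the summary table.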

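As others have commented, there are arguably better ways to plot coefficients (easier, and preferable for visual communication). The dotwhisker::dwplot() function does several convenient things:

  • automatically extracts the coefficients and plots them
  • automatically scales continuous predictors by 2 standard deviations so that the coefficients can be compared with one another (use by_2sd=FALSE if you don't want this)
  • automatically omits the intercept, which is on a different scale from the other parameters and is rarely of inferential interest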
library(dotwhisker)
dwplot(lm(Murder/Population ~ ., data=as.data.frame(state.x77)))

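
To see what scaling by two standard deviations does to a coefficient, here is a base-R sketch of the arithmetic. It uses mtcars and an arbitrary formula as assumptions for illustration, and it is not dotwhisker's internal code. Dividing a predictor by 2 * sd multiplies its slope by the same factor, which puts predictors measured in very different units on a comparable footing:

```r
# Fit on the raw predictors
fit_raw <- lm(mpg ~ wt + hp, data = mtcars)

# Refit after dividing each predictor by two of its standard deviations
scaled  <- transform(mtcars,
                     wt = wt / (2 * sd(wt)),
                     hp = hp / (2 * sd(hp)))
fit_2sd <- lm(mpg ~ wt + hp, data = scaled)

# The rescaled slopes equal the raw slopes times 2 * sd of each predictor
two_sd <- 2 * c(wt = sd(mtcars$wt), hp = sd(mtcars$hp))
cbind(refit = coef(fit_2sd)[-1],
      raw_times_2sd = coef(fit_raw)[-1] * two_sd)
```

Since the default for by_2sd may differ between dotwhisker versions, it is probably safest to pass it explicitly when calling dwplot().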
