Confusion matrix where each column sums to 1

Question

I'm trying to make a confusion matrix showing correct/incorrect guesses of facial expressions, using the following data:

> dput(conf_mat)
structure(list(Target = c("Angry", "Angry", "Angry", "Angry", 
"Angry", "Angry", "Angry", "Disgusted", "Disgusted", "Disgusted", 
"Disgusted", "Disgusted", "Disgusted", "Disgusted", "Fearful", 
"Fearful", "Fearful", "Fearful", "Fearful", "Fearful", "Fearful", 
"Happy", "Happy", "Happy", "Happy", "Happy", "Happy", "Happy", 
"Neutral", "Neutral", "Neutral", "Neutral", "Neutral", "Neutral", 
"Neutral", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Surprised", 
"Surprised", "Surprised", "Surprised", "Surprised", "Surprised", 
"Surprised"), Prediction = c("Angry", "Disgusted", "Fearful", 
"Happy", "Neutral", "Sad", "Surprised", "Angry", "Disgusted", 
"Fearful", "Happy", "Neutral", "Sad", "Surprised", "Angry", "Disgusted", 
"Fearful", "Happy", "Neutral", "Sad", "Surprised", "Angry", "Disgusted", 
"Fearful", "Happy", "Neutral", "Sad", "Surprised", "Angry", "Disgusted", 
"Fearful", "Happy", "Neutral", "Sad", "Surprised", "Angry", "Disgusted", 
"Fearful", "Happy", "Neutral", "Sad", "Surprised", "Angry", "Disgusted", 
"Fearful", "Happy", "Neutral", "Sad", "Surprised"), N = c(456L, 
31L, 14L, 1L, 11L, 46L, 1L, 92L, 454L, 3L, 2L, 1L, 4L, 4L, 2L, 
40L, 382L, 1L, 1L, 10L, 124L, 0L, 2L, 0L, 552L, 3L, 2L, 1L, 3L, 
2L, 2L, 7L, 528L, 16L, 2L, 8L, 30L, 17L, 4L, 19L, 481L, 1L, 0L, 
4L, 20L, 3L, 2L, 4L, 527L)), row.names = c(NA, -49L), class = c("tbl_df", 
"tbl", "data.frame"))

Following an online tutorial, I have been able to get this far:

library(cvms)
plot_confusion_matrix(conf_mat,
                      class_order=c("Surprised", "Disgusted", "Fearful", "Angry",  "Sad", "Happy", "Neutral"),
                      add_counts=FALSE,
                      add_row_percentages=FALSE,
                      add_col_percentages=FALSE
                      )

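The plot above shows how often each cell occurs across the whole dataset. I would like it to show how common each prediction is within a single column, so that each column sums to 1. How can I do this (without switching to Python)? Here is an example of what I'm looking for (from someone else's work; note that the axes are flipped, so in the picture below the rows sum to 1 rather than the columns):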

Answer 1

You can compute the proportions by Target, so that the cells for each Target (one column of the plot) sum to 1:

library(ggplot2)
library(dplyr)  # the .by argument of mutate() needs dplyr >= 1.1.0

# Desired class order, as in the question's class_order
lvl <- c("Surprised", "Disgusted", "Fearful", "Angry",  "Sad", "Happy", "Neutral")

conf_mat |>
  # Fix the axis order; Target is reversed relative to Prediction
  mutate(Target = factor(Target, levels = rev(lvl)),
         Prediction = factor(Prediction, levels = lvl)) |>
  # Normalize within each Target so every column of the plot sums to 1;
  # use white labels on dark tiles and black labels on light ones
  mutate(prop = N / sum(N),
         prop_col = if_else(prop > .5, "white", "black"), .by = Target) |>
  ggplot(aes(x = Target, y = Prediction, fill = prop)) +
  geom_tile() +
  geom_text(aes(label = scales::label_percent(.1)(prop), color = prop_col)) +
  scale_fill_gradient(low = "#EEF2F8", high = scales::muted("blue"), guide = "none") +
  scale_color_identity() +
  scale_x_discrete(position = "top") +
  theme_minimal()
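
To double-check the normalization independently of the plot, here is a minimal base R sketch (not part of the original answer). It rebuilds the counts from the question's dput output as a plain data.frame, cross-tabulates them with xtabs(), and divides each Target column by its total with prop.table(). The names classes, tab and col_prop are only illustrative.

```r
# Rebuild the question's data; the rows are ordered Target-major, Prediction-minor
classes <- c("Angry", "Disgusted", "Fearful", "Happy", "Neutral", "Sad", "Surprised")
conf_mat <- data.frame(
  Target     = rep(classes, each = 7),
  Prediction = rep(classes, times = 7),
  N = c(456L, 31L, 14L, 1L, 11L, 46L, 1L,
        92L, 454L, 3L, 2L, 1L, 4L, 4L,
        2L, 40L, 382L, 1L, 1L, 10L, 124L,
        0L, 2L, 0L, 552L, 3L, 2L, 1L,
        3L, 2L, 2L, 7L, 528L, 16L, 2L,
        8L, 30L, 17L, 4L, 19L, 481L, 1L,
        0L, 4L, 20L, 3L, 2L, 4L, 527L)
)

# Rows = Prediction, columns = Target, cells = counts
tab <- xtabs(N ~ Prediction + Target, data = conf_mat)

# margin = 2 divides every cell by its column (Target) total
col_prop <- prop.table(tab, margin = 2)

print(round(col_prop, 3))
print(colSums(col_prop))  # each entry should be 1, up to floating-point error
```

If the axes are swapped, as in the example picture from the question, margin = 1 would normalize the rows instead.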
