ggplot: ordering an axis with flipped coordinates and facets

Question (votes: 2, answers: 1)

I have a dataset that looks like this (LDA output).

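    library(dplyr)     # %>%, group_by(), top_n(), ungroup(), arrange()
    library(tidytext)  # tidy() for LDA models (assumes ldaOut comes from topicmodels)

    # One row per (topic, term) with its per-topic probability, beta
    lda_tt <- tidy(ldaOut)

    # Keep the 10 highest-beta terms per topic (top_n() keeps ties, so a
    # topic can return more than 10 rows), sorted by topic, then descending beta
    lda_tt <- lda_tt %>%
            group_by(topic) %>%
            top_n(10, beta) %>%
            ungroup() %>%
            arrange(topic, -beta)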

        topic  term        beta
    1   1      council     0.044069733
    2   1      report      0.020086205
    3   1      budget      0.016918569
    4   1      polici      0.01646605
    5   1      term        0.015051927
    6   1      annual      0.014938797
    7   1      control     0.014316583
    8   1      audit       0.013637803
    9   1      rate        0.012732765
    10  1      fund        0.011997421
    11  2      debt        0.033760856
    12  2      plan        0.030379431
    13  2      term        0.02925229
    14  2      fiscal      0.021836885
    15  2      polici      0.017802904
    16  2      mayor       0.015548621
    17  2      transpar    0.013175692
    18  2      relat       0.012997722
    19  2      capit       0.012463813
    20  2      long        0.011989227
    21  2      remain      0.011989227
    22  3      parti       0.031795751
    23  3      elect       0.029929187
    24  3      govern      0.025496098
    25  3      mayor       0.023046232
    26  3      district    0.014588364
    27  3      public      0.014471704
    28  3      administr   0.013596752
    29  3      budget      0.011730188
    30  3      polit       0.011730188
    31  3      seat        0.010563586
    32  3      state       0.010563586
    33  4      budget      0.037069484
    34  4      revenu      0.025043026
    35  4      account     0.018459577
    36  4      oper        0.01721546
    37  4      tax         0.015867667
    38  4      debt        0.014416198
    39  4      compani     0.013690464
    40  4      expenditur  0.012135318
    41  4      consolid    0.011305907
    42  4      increas     0.010891202
    43  5      invest      0.026534237
    44  5      elect       0.023341538
    45  5      administr   0.022296654
    46  5      improv      0.02189031
    47  5      develop     0.019162003
    48  5      project     0.017826874
    49  5      transport   0.016375647
    50  5      local       0.016317598
    51  5      infrastr    0.014401978
    52  5      servic      0.014111733

I want to create five topic plots, one per topic, with the terms ordered by beta. Here is the code:

    lda_tt %>%
        mutate(term = reorder(term, beta)) %>%
        ggplot(aes(term, beta, fill = factor(topic))) +
        geom_bar(alpha = 0.8, stat = "identity", show.legend = FALSE) +
        facet_wrap(~ topic, scales = "free") +
        coord_flip()

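I get this graph (image: "Terms by beta"). As you can see, despite the reordering step, the terms are not sorted by beta within each panel. For example, "budget" should be the top term in topic 4, "invest" should be at the top of topic 5, and so on. How can I sort the terms within each topic on each facet? There are several Stack Overflow questions about ordering in ggplot, but none of them helped me solve this problem.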

r ggplot2 lda
1 answer
1 vote

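The link Tung suggested provides a way to solve the problem. It seems that each term needs to be encoded as a unique factor level within its topic to get the proper ordering. We can append "_" and the topic number to each term (done on the second and third lines of code, in the mutate() call), but display only the term without the "_" and topic number (handled by the last line of code). Because the factor levels are taken from the row order, this relies on lda_tt already being sorted by topic and descending beta, as in the question. The following code produces a faceted plot with the correct ordering.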

    lda_tt %>%

        mutate(term = factor(paste(term, topic, sep = "_"),
                             levels = rev(paste(term, topic, sep = "_")))) %>%#convert to factor

        ggplot(aes(term, beta, fill = factor(topic))) +
        geom_bar(alpha = 0.8, stat = "identity", show.legend = FALSE) +
        facet_wrap(~ topic, scales = "free") +
        coord_flip() + 

        scale_x_discrete(labels = function(x) gsub("_.+$", "", x)) #remove "_" and topic number
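
To see why the suffix is needed, here is a minimal base R sketch on made-up toy data (the `toy` data frame and its values are hypothetical, not taken from the LDA output above). It assumes the rows are already sorted by topic and descending beta, as in the question. A factor has a single level order shared by every facet, so a term that occurs in several topics can only sit in one position unless it is made unique per topic.

```r
# Hypothetical toy data: "budget" occurs in both topics.
# Rows are sorted by topic, then descending beta.
toy <- data.frame(
  topic = c(1, 1, 2, 2),
  term  = c("budget", "council", "tax", "budget"),
  beta  = c(0.05, 0.04, 0.02, 0.01),
  stringsAsFactors = FALSE
)

# reorder() ranks each term by its mean beta across ALL topics,
# so "budget" (mean 0.03) gets one global position.
global_levels <- levels(reorder(toy$term, toy$beta))
print(global_levels)     # "tax" "budget" "council"
# In topic 1, "budget" (0.05) should rank above "council" (0.04),
# but the global order places "council" higher.

# Making each term unique per topic gives every facet its own levels.
key <- paste(toy$term, toy$topic, sep = "_")
per_topic <- factor(key, levels = rev(key))  # reversed so the top term is drawn last
print(levels(per_topic)) # "budget_2" "tax_2" "council_1" "budget_1"

# Strip the suffix for display, as the scale_x_discrete() call does.
# Note: this pattern cuts at the first "_", so it assumes terms contain no "_".
axis_labels <- gsub("_.+$", "", levels(per_topic))
print(axis_labels)       # "budget" "tax" "council" "budget"
```

With `scales = "free"`, each panel drops the levels it does not use, so only the per-topic order matters. The tidytext package also offers `reorder_within()` and `scale_x_reordered()`, which wrap the same idea; check that your installed version includes them before relying on them.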