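I have the syllables of each Utterance, with their durations in dur, and want to plot them with geom_tile. The way I do this below is not optimal, because the color scale is geared to the maximum number of syllable durations across all Utterances in the data frame.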
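library(dplyr)    # provides the %>% pipe used below
library(ggplot2)
df %>%
  ggplot(aes(x = start + dur/2, y = MM, width = dur, height = 0.3, fill = rank)) +
  geom_tile(linewidth = 0.65, color = "white") +  # linewidth replaces size for tile outlines in ggplot2 >= 3.4.0
  facet_wrap(~ Utterance, scales = "free_x", ncol = 1) +
  labs(fill = "SyllDur")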
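What I would like is a color scale whose basic range is the same for every Utterance, no matter how many syllables it contains. That is, even if an Utterance has only 3 syllables, the syllables with the minimum and maximum durations should get the same hues as the minimum and maximum durations in an Utterance with 10 syllables.
How can this be achieved?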
df <- structure(list(Utterance = structure(c(1L, 4L, 4L, 1L, 4L, 1L,
1L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 4L, 4L, 4L, 4L), levels = c("C: [what's] a !mountain! for you:",
"NA: (0.554)", "C: ((v: laughs))=", "A: =I don't know the (Schauinsland) is a small mount[ain]"
), class = "factor"), MM = c("syll", "syll", "syll", "syll",
"syll", "syll", "syll", "syll", "syll", "syll", "syll", "syll",
"syll", "syll", "syll", "syll", "syll", "syll"), start = c(407,
3967, 4827, 1147, 3682, 577, 847, 4697, 0, 4317, 3807, 1340,
5287, 4457, 3437, 4037, 4917, 5517), dur = c(170L, 70L, 90L,
193L, 125L, 270L, 300L, 130L, 407L, 140L, 160L, 480L, 230L, 240L,
245L, 280L, 370L, 538L), rank = c(1, 1, 2, 2, 3, 3, 4, 4, 5,
5, 6, 6, 7, 8, 9, 10, 11, 12)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -18L))
One approach could be to normalize each rank as a fraction of the maximum rank for that utterance:
df |>
  # per-utterance normalization; the .by argument needs dplyr >= 1.1.0
  dplyr::mutate(rank_norm = rank / max(rank), .by = Utterance) |>
  ggplot(aes(x = start + dur/2, y = MM, width = dur, height = 0.3, fill = rank_norm)) +
  geom_tile(linewidth = 0.65, color = "white") +  # linewidth replaces size in ggplot2 >= 3.4.0
  facet_wrap(~ Utterance, scales = "free_x", ncol = 1) +
  labs(fill = "SyllDur")
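Note that dividing by the maximum rank pins the top of the scale at 1 for every utterance, but the bottom still varies: in the example data it is 1/6 for the 6-syllable utterance and 1/12 for the 12-syllable one. If the minimum should also share a hue, a min-max rescale within each utterance may be closer to what the question asks for. The following is only a sketch under assumptions: `toy`, `rescale01`, and `dur_scaled` are hypothetical names, the toy data are invented for illustration, and the helper could be applied to `rank` just as well as to `dur`.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data (not the asker's df): two utterances of different length
toy <- tibble::tibble(
  Utterance = rep(c("short", "long"), times = c(3, 5)),
  MM        = "syll",
  start     = c(0, 100, 250, 0, 80, 200, 330, 500),
  dur       = c(100L, 150L, 90L, 80L, 120L, 130L, 170L, 60L)
)

# Hypothetical helper: min-max rescale to [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  # a group with a single distinct value has no range; place it mid-scale
  if (rng[1] == rng[2]) return(rep(0.5, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Rescale within each utterance, so the shortest syllable maps to 0 and the
# longest to 1 regardless of how many syllables the utterance has
toy_scaled <- toy |>
  mutate(dur_scaled = rescale01(dur), .by = Utterance)

p <- ggplot(toy_scaled,
            aes(x = start + dur/2, y = MM, width = dur, height = 0.3,
                fill = dur_scaled)) +
  geom_tile(linewidth = 0.65, color = "white") +
  facet_wrap(~ Utterance, scales = "free_x", ncol = 1) +
  labs(fill = "SyllDur (scaled)")

print(p)
```

The trade-off is that the legend no longer shows durations in their original units, only relative position within each utterance.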