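I'm using geom_jitter() on a ggplot boxplot. I noticed that it adds a point for every record on top of the boxplot, rather than only the points that represent outliers.
This code demonstrates it.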
library(dplyr)
library(ggplot2)

data <- as.data.frame(c(rnorm(10000, mean = 10, sd = 20), rnorm(300, mean = 90, sd = 5)))
names(data) <- "blapatybloo"
data %>% ggplot(aes("column", blapatybloo)) + geom_boxplot() + geom_jitter(alpha = .1)
How can I apply geom_jitter only to the outlier points of the boxplot, without overplotting all the other records?
Create a new column that flags whether each data point is an outlier. Then overlay only the flagged points on the boxplot.
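library(dplyr)
library(ggplot2)

data <- as.data.frame(c(rnorm(10000, mean = 10, sd = 20),
                        rnorm(300, mean = 90, sd = 5)))
names(data) <- "blapatybloo"

# Flag points more than 1.5 * IQR beyond the first or third quartile,
# the rule geom_boxplot() uses by default to decide what is an outlier.
data <- data %>%
  mutate(outlier = blapatybloo > quantile(blapatybloo, 0.75) +
           IQR(blapatybloo) * 1.5 | blapatybloo < quantile(blapatybloo, 0.25) -
           IQR(blapatybloo) * 1.5)

# Suppress the boxplot's own outlier points, then draw only the flagged
# rows as jittered points.
data %>%
  ggplot(aes("column", blapatybloo)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(data = function(x) dplyr::filter(x, outlier),
             position = "jitter")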
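
As a variation on the same idea, the outlier test can be wrapped in a small helper, and the jitter can be limited to the horizontal direction so the plotted y values are not displaced. This is only a sketch: is_boxplot_outlier() is a hypothetical helper name, the seed, jitter width and alpha are arbitrary choices, and the fences are measured from the quartiles because that is how geom_boxplot() defines outliers by default.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper (not part of ggplot2 or dplyr): TRUE for values that lie
# more than coef * IQR below the first quartile or above the third quartile.
is_boxplot_outlier <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- coef * (q[[2]] - q[[1]])
  x < q[[1]] - fence | x > q[[2]] + fence
}

set.seed(42)  # arbitrary seed, only to make the simulated data reproducible
data <- data.frame(
  blapatybloo = c(rnorm(10000, mean = 10, sd = 20),
                  rnorm(300, mean = 90, sd = 5))
)

p <- data %>%
  mutate(outlier = is_boxplot_outlier(blapatybloo)) %>%
  ggplot(aes("column", blapatybloo)) +
  geom_boxplot(outlier.shape = NA) +  # hide the boxplot's own outlier points
  geom_point(data = function(d) filter(d, outlier),
             # height = 0 keeps each point at its true y value
             position = position_jitter(width = 0.2, height = 0),
             alpha = 0.3)
p
```

Setting height = 0 matters here because the default jitter also moves points vertically, which could push a point back inside the whiskers or misstate its value.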