Significant runtime inflation when using alpha to reduce overplotting in the R package ggplot2

Question

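I have a moderately sized data set that I am trying to visualize, nrow(df)=7810. To reduce overplotting, I am using alpha=.3. This greatly increases the time R takes to produce the plot. Here are my specs,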

OS Name Microsoft Windows 10 Home
Version 10.0.18362 Build 18362
Processor Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz, 3401 Mhz, 4 Core(s), 8 Logical Processor(s)
Installed Physical Memory (RAM) 32.0 GB
System Type x64-based PC
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
ggplot2 version 3.2.1

Here is an example of what is happening,

> p <-ggplot(df, aes(x=x))
> t1<-function(){p + geom_point(aes(y=y), shape=4, size=.5)}
> t2<-function(){p + geom_point(aes(y=y), shape=4, size=.5, alpha=.3)}

> system.time(print(t1()))
   user  system elapsed 
   0.14    0.37    0.53 

> system.time(print(t2()))
   user  system elapsed 
   0.25   29.69   30.04

Does anyone know what is causing this script to run so slowly?

Answer 1

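The alpha value alone is not responsible for the slowdown. It is the combination of alpha with the shape that appears to slow things down.

Complex vector shapes such as the "x" rendered by shape = 4 appear to slow rendering down drastically when combined with an alpha value. If you are not committed to shape = 4, using something like shape = 16 can speed things up while keeping your desired alpha value. Example below: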

library(dplyr)
library(ggplot2)
df <- tibble(x = rnorm(n = 7810),
             y = rnorm(n = 7810))

p1 <- function() {
  p <- ggplot(df) +
    geom_point(aes(x, y), shape=4, size=.5)
  print(p)
}

p2 <- function() {
  p <- ggplot(df) +
    geom_point(aes(x, y), shape=4, size=.5, alpha = 0.3)
  print(p)
}

p3 <- function() {
  p <- ggplot(df) +
    geom_point(aes(x, y), shape=16, size=.5, alpha = 0.3)
  print(p)
}

p4 <- function() {
  p <- ggplot(df) +
    geom_point(aes(x, y), shape=22, size=.5, alpha = 0.3)
  print(p)
}

test <- microbenchmark::microbenchmark(no_alpha = p1(),
                               alpha = p2(),
                               alpha_circle = p3(),
                               alpha_square = p4(),
                               times = 10)

print(test)

Unit: milliseconds
         expr        min         lq       mean     median         uq       max neval
     no_alpha   837.5163   851.7994  1025.0569   910.3687  1173.8753  1403.087    10
        alpha 41456.3393 41708.0781 45831.6033 42589.4998 45219.8180 59578.347    10
 alpha_circle   429.7718   536.9076   719.5507   549.7952   555.9002  1780.282    10
 alpha_square   800.1380   806.5523   882.0163   815.6232   842.4669  1450.395    10
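
To put the table in perspective, here is a small sketch that expresses each median as a multiple of the no-alpha baseline. The numbers are copied from the output above; the names `medians_ms` and `ratios` are just illustrative. On these figures, shape 4 with alpha comes out at roughly 47 times the baseline, while the circle and square with alpha are at or below it.

```r
# Median render times in milliseconds, copied from the benchmark output above.
medians_ms <- c(
  no_alpha     = 910.3687,
  alpha        = 42589.4998,
  alpha_circle = 549.7952,
  alpha_square = 815.6232
)

# Each median relative to the shape = 4, no-alpha baseline.
ratios <- medians_ms / medians_ms[["no_alpha"]]
print(round(ratios, 1))
```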

Edit:

We can use microbenchmark and purrr to see which shapes give the fastest plotting times.

library(purrr)
library(microbenchmark)

df <- tibble(x = rnorm(n = 7810),
             y = rnorm(n = 7810))

s <- tibble(shape = c(0:24))

plot_fun <- function(shape) {
  p <- ggplot(df) +
    geom_point(aes(x, y), 
               shape = shape,
               alpha = 0.3)
  print(p)
}


test_fun <- function(shape) {
  microbenchmark(plot_fun(shape = shape),
                 times = 10)
}

# Benchmark every shape once and keep the results as a list-column.
# (Running a separate map() beforehand would only repeat the benchmarks.)
s <- s %>%
  mutate(test = map(shape,
                    ~test_fun(shape = .x)))

s %>% 
  tidyr::unnest(test) %>%
  # microbenchmark stores raw timings in nanoseconds; convert to milliseconds
  mutate(time = time / 1e6) %>%
  ggplot() +
  geom_boxplot(aes(x = shape, y = time, group = shape), outlier.shape = NA) +
  scale_x_continuous(breaks = c(0:24)) +
  scale_y_log10() +
  coord_flip()

It appears that shape values 0, 1, and 15 through 22 render faster than the rest.
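
If you would rather have numbers than a boxplot, here is a minimal sketch that ranks a few shapes by median elapsed time using only ggplot2 and base R. The helper `time_shape` and the choice of three repetitions are my own for illustration. One caveat: the sketch draws to a null PDF device so that it can run non-interactively, and that device may not show the slowdown seen on the on-screen Windows device, so swap in the device you actually use before drawing conclusions.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(7810), y = rnorm(7810))

# Hypothetical helper: elapsed seconds for one full render with a given shape.
time_shape <- function(shape, alpha = 0.3) {
  p <- ggplot(df) +
    geom_point(aes(x, y), shape = shape, alpha = alpha)
  unname(system.time(print(p))["elapsed"])
}

# Assumption: an off-screen device, so nothing is written to disk.
pdf(NULL)
shapes <- c(4, 16, 22)
timings <- data.frame(
  shape    = shapes,
  median_s = sapply(shapes, function(s) median(replicate(3, time_shape(s))))
)
invisible(dev.off())

# Slowest shapes first.
print(timings[order(-timings$median_s), ])
```

Extending `shapes` to 0:24 gives the same comparison as the boxplot, just as a table.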
