我有一个函数,使用移动窗口计算中位数和90%CI。所以对于每个x = seq(xmin, xmax, by = wStep)
,我返回所有y
的中位数和5%和95%分位数,其x
值小于wSize/2
。我希望通过创建自定义平滑函数stat_movingwindow()
,使用ggplot2将其显示为线条和功能区。我可以使用geom_smooth(data = ..., stat = "identity")
创建我想要的结果:
moveWin <- function(d, wSize = 0.5, wStep = 0.1,
f = function(x) quantile(x, prob = c(0.05,0.50,0.95), na.rm = TRUE)
){
x <- seq(min(d$x), max(d$x), by = wStep)
y <- matrix(NA, ncol = 3, nrow = length(x))
for(i in seq_along(x)){
y[i, ] <- f(d[abs(d$x - x[i]) < wSize/2, ]$y)
}
y <- as.tibble(y)
colnames(y) <- c("ymin","y","ymax")
y$x <- x
return(as.tibble(y))
}
set.seed(123)
d <- tibble(
x= sqrt(seq(0,1,length.out = 50))*10,
y= rnorm(50)
)
ggplot(data = d) + aes(x = x, y = y) +
geom_smooth(
data = function(d) moveWin(d, wSize = 1, wStep = 0.1),
mapping = aes(ymin = ymin, ymax= ymax),
stat = "identity") +
geom_point() + scale_x_continuous(breaks = 1:10)
继Vignette Extending ggplot2之后,这是我到目前为止提出的代码。但是,问题是这不显示功能区。也许我需要一些方法来声明这个自定义统计提供美学ymin和ymax。如何获得以下代码以输出与上面类似的结果?
StatMovingWindow <- ggproto("StatMovingWindow", Stat,
compute_group = function(data, scales, wSize, wStep, fun) {
moveWin(data, wSize = wSize, wStep = wStep, f = fun)
},
required_aes = c("x", "y")
)
stat_movingwindow <- function(mapping = NULL, data = NULL,
fun = function(d) quantile(d, probs = c(0.05, 0.50, 0.95), na.rm = TRUE),
wStep = 0.1, wSize = 1,
geom = "smooth", position = "identity", show.legend = NA, inherit.aes = TRUE,
...
){
layer(
stat = StatMovingWindow, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(wStep = wStep, wSize = wSize, fun = fun, ...)
)
}
ggplot(data = d) + aes(x = x, y = y) +
stat_movingwindow(wStep = 0.1, wSize = 1) +
geom_point() + scale_x_continuous(breaks = 1:10)
在stat_movingwindow
的代码中,相应geom的行是geom = "smooth"
:
stat_movingwindow <- function(mapping = NULL, data = NULL,
fun = function(d) quantile(d, probs = c(0.05, 0.50, 0.95), na.rm = TRUE),
wStep = 0.1, wSize = 1,
geom = "smooth", # <- look here
position = "identity", show.legend = NA, inherit.aes = TRUE,
...
){
layer(
stat = StatMovingWindow, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(wStep = wStep, wSize = wSize, fun = fun, ...)
)
}
检查geom_smooth
的代码,我们看到它包含参数se = TRUE
,并使用GeomSmooth
作为其geom:
> geom_smooth
function (mapping = NULL, data = NULL, stat = "smooth", position = "identity",
..., method = "auto", formula = y ~ x, se = TRUE, # <- look here
na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
{
params <- list(na.rm = na.rm, se = se, ...)
if (identical(stat, "smooth")) {
params$method <- method
params$formula <- formula
}
layer(data = data, mapping = mapping, stat = stat, geom = GeomSmooth, # <- and here
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = params)
}
深入挖掘GeomSmooth,我们看到它的draw_group
函数(负责绘制平滑线)将se = FALSE
作为其默认参数。
从代码中,如果se == FALSE
,has_ribbon
也将是FALSE
,即使你的数据中存在ymax
和ymin
,你还要感谢StatMovingWindow$compute_group
函数。而这反过来意味着GeomLine$draw_panel(path, panel_params, coord)
的唯一结果将单独返回,没有GeomRibbon$draw_group(ribbon, panel_params, coord)
。
> GeomSmooth$draw_group
<ggproto method>
<Wrapper function>
function (...)
f(...)
<Inner function (f)>
function (data, panel_params, coord, se = FALSE) # <- look here
{
ribbon <- transform(data, colour = NA)
path <- transform(data, alpha = NA)
has_ribbon <- se && !is.null(data$ymax) && !is.null(data$ymin) # <- and here
gList(if (has_ribbon) GeomRibbon$draw_group(ribbon, panel_params, coord),
GeomLine$draw_panel(path, panel_params, coord))
}
简而言之,geom_smooth
的se = TRUE
的默认参数会覆盖GeomSmooth$draw_group
中的默认行为(同样适用于stat_smooth
),如果我们想要获得相同的结果,我们应该在stat_movingwindow
中执行相同的操作。
如果您希望通常需要绘制功能区,则可以在se = TRUE
的定义中包含stat_movingwindow
作为参数。如果它是临时的,您可以在代码中随时包含它。