How can I output a tibble from if_else, and how can I output a list column with mutate?

Question

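I am trying to extrapolate missing values in a time series. I'm comfortable with the tidyverse, but I seem to have hit a wall, and I'm not sure whether I should be taking a different approach.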

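I have a data frame with unevenly spaced times. If the time between two measurements exceeds some threshold, I want to extrapolate linearly between the nearest times above and below, with the new measurements evenly spaced. Something like this: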

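library(tidyverse)

df <- tibble(
  time = c(1, 2, 3, 8, 9),
  meas = c(200, 300, 320, 500, 600)
)

# Typical spacing between measurements; the first difference is NA, so drop it
avg_meas_interval <- median(df$time - lag(df$time), na.rm = TRUE)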

df %>% mutate(
  if_else(
    time - lag(time) > threshold,
    tibble(
      time_extrp = seq(lag(time), time, avg_meas_interval),
      meas_extrp = seq(lag(meas), meas, length.out = length(time_extrp))
    ),
    NA
  )
)

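I would then have a column of tibbles holding the extrapolated rows that should be inserted before a given row, or NA where there is nothing to insert. I could pull that column out and combine it with bind_rows.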

However, I ran into a few problems with this approach:

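if_else doesn't like returning tibbles, because it expects the possible outcomes of the if-else to have the same length.
mutate doesn't like outputting a list of tibbles. I can work around this by using map to write a separate list to a new variable (so that it isn't part of the tibble) and then binding the rows, but that seems a bit roundabout.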

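It seems that a simple extrapolation like this, where the size of the output depends on the input data, is hard to implement in a purely functional style. I'm not sure whether there is a better way than just implementing it procedurally in R or Python.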
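For reference, here is a minimal sketch of the two points in the list above. if_else is vectorised and needs its outcomes to match the size of the condition, so it cannot return a tibble per row. A plain if/else inside map2, on the other hand, returns a list with one element per row, and mutate stores that as a list column. The values of `threshold` and `step` are assumptions, since the question does not give them, and `filler` is a made-up column name.

```r
library(dplyr)
library(purrr)
library(tibble)

df <- tibble(
  time = c(1, 2, 3, 8, 9),
  meas = c(200, 300, 320, 500, 600)
)

threshold <- 2  # assumed value; not given in the question
step <- 1       # assumed spacing for the inserted times

res <- df %>%
  mutate(
    # map2() returns one list element per row, so mutate() accepts it
    # as a list column; plain if/else handles a single row at a time.
    filler = map2(lag(time), time, function(t0, t1) {
      if (!is.na(t0) && t1 - t0 > threshold) {
        tibble(time_extrp = seq(t0, t1, by = step))
      } else {
        NULL
      }
    })
  )
```

Only the row at time 8 follows a gap wider than the assumed threshold, so only that row gets a tibble; the other entries of `filler` are NULL.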

Answer 1

Perhaps this?

# Assumes `threshold` is already defined; the question does not give a value.
df %>%
  mutate(
    across(c(time, meas), lag, .names = "lag_{.col}"),
    intrvl = c(1, rep(median(diff(time)), n() - 1)), # I'm inferring here ...
    extrp = purrr::pmap(
      list(time, lag_time, meas, lag_meas, intrvl),
      ~ if (!is.na(..2) && (..1 - ..2) > threshold) {
         tibble(time_extrp = seq(..2, ..1, length.out=..5),
                meas_extrp = seq(..4, ..3, length.out=..5))
      })
  )
# # A tibble: 5 × 6
#    time  meas lag_time lag_meas intrvl extrp           
#   <dbl> <dbl>    <dbl>    <dbl>  <dbl> <list>          
# 1     1   200       NA       NA      1 <NULL>          
# 2     2   300        1      200      1 <NULL>          
# 3     3   320        2      300      1 <NULL>          
# 4     8   500        3      320      1 <tibble [1 × 2]>
# 5     9   600        8      500      1 <NULL>
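Note that the code above passes `intrvl` as `length.out`, so with an interval of 1 each filler tibble has a single row, as the printed output shows. If the goal is to end up with the new rows inserted into the original data, one possible sketch is below. It is not the answerer's method: it uses the typical spacing as the `by` step, interpolates the measurements with `approx()`, and binds the result back with `bind_rows()`. The helper `fill_gap` and the value of `threshold` are assumptions made for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

df <- tibble(
  time = c(1, 2, 3, 8, 9),
  meas = c(200, 300, 320, 500, 600)
)

threshold <- 2                 # assumed value; not given in the question
step <- median(diff(df$time))  # typical spacing between measurements

# Hypothetical helper: rows to insert strictly between two neighbouring
# measurements, or NULL when the gap does not exceed the threshold.
fill_gap <- function(t0, t1, m0, m1) {
  if (is.na(t0) || t1 - t0 <= threshold) return(NULL)
  t_new <- seq(t0, t1, by = step)
  # Drop the end points, which are already present in df.
  t_new <- t_new[t_new > t0 & t_new < t1]
  # Linear interpolation between the two surrounding measurements.
  tibble(time = t_new, meas = approx(c(t0, t1), c(m0, m1), xout = t_new)$y)
}

filler <- pmap(
  list(lag(df$time), df$time, lag(df$meas), df$meas),
  fill_gap
)

# compact() drops the NULL entries before the rows are bound together.
filled <- bind_rows(df, bind_rows(compact(filler))) %>%
  arrange(time)
```

With the example data and the assumed threshold of 2, `filled` has nine rows: the original five plus times 4 to 7, with measurements 356, 392, 428 and 464.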
