Summarize the mean of a column across overlapping dates


I have a data frame that looks like this:

    Datetime           value date         lat   lon sunrise             sunset             
   <dttm>              <dbl> <date>     <dbl> <dbl> <dttm>              <dttm>             
 1 2021-09-01 00:00:00     9 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 2 2021-09-01 01:00:00    22 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 3 2021-09-01 02:00:00     0 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 4 2021-09-01 03:00:00     9 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 5 2021-09-01 04:00:00     9 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 6 2021-09-01 05:00:00    35 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 7 2021-09-01 06:00:00     9 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 8 2021-09-01 20:00:00     0 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
 9 2021-09-01 21:00:00    48 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
10 2021-09-01 22:00:00     0 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
11 2021-09-01 23:00:00     0 2021-09-01  36.2 -92.3 2021-09-01 06:42:27 2021-09-01 19:38:56
12 2021-09-02 00:00:00     0 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
13 2021-09-02 01:00:00    26 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
14 2021-09-02 02:00:00    26 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
15 2021-09-02 03:00:00    45 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
16 2021-09-02 04:00:00     0 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
17 2021-09-02 05:00:00    31 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
18 2021-09-02 06:00:00    84 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
19 2021-09-02 20:00:00    21 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
20 2021-09-02 21:00:00     0 2021-09-02  36.2 -92.3 2021-09-02 06:43:14 2021-09-02 19:37:31
structure(list(Datetime = structure(c(1630472400, 1630476000, 1630479600, 
1630483200, 1630486800, 1630490400, 1630494000, 1630544400, 1630548000, 
1630551600, 1630555200, 1630558800, 1630562400, 1630566000, 1630569600, 
1630573200, 1630576800, 1630580400, 1630630800, 1630634400), tzone = "America/Chicago", class = c("POSIXct", 
"POSIXt")), value = c(9, 22, 0, 9, 9, 35, 9, 0, 48, 0, 0, 0, 
26, 26, 45, 0, 31, 84, 21, 0), date = structure(c(18871, 18871, 
18871, 18871, 18871, 18871, 18871, 18871, 18871, 18871, 18871, 
18872, 18872, 18872, 18872, 18872, 18872, 18872, 18872, 18872
), class = "Date"), lat = c(36.224, 36.224, 36.224, 36.224, 36.224, 
36.224, 36.224, 36.224, 36.224, 36.224, 36.224, 36.224, 36.224, 
36.224, 36.224, 36.224, 36.224, 36.224, 36.224, 36.224), lon = c(-92.315, 
-92.315, -92.315, -92.315, -92.315, -92.315, -92.315, -92.315, 
-92.315, -92.315, -92.315, -92.315, -92.315, -92.315, -92.315, 
-92.315, -92.315, -92.315, -92.315, -92.315), sunrise = structure(c(1630496547, 
1630496547, 1630496547, 1630496547, 1630496547, 1630496547, 1630496547, 
1630496547, 1630496547, 1630496547, 1630496547, 1630582994, 1630582994, 
1630582994, 1630582994, 1630582994, 1630582994, 1630582994, 1630582994, 
1630582994), tzone = "America/Chicago", class = c("POSIXct", 
"POSIXt")), sunset = structure(c(1630543136, 1630543136, 1630543136, 
1630543136, 1630543136, 1630543136, 1630543136, 1630543136, 1630543136, 
1630543136, 1630543136, 1630629451, 1630629451, 1630629451, 1630629451, 
1630629451, 1630629451, 1630629451, 1630629451, 1630629451), tzone = "America/Chicago", class = c("POSIXct", 
"POSIXt"))), row.names = c(NA, -20L), class = c("tbl_df", "tbl", 
"data.frame"))

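The data originally contained all 24 hours of each day, but I have filtered it down to only the hours after sunset and before sunrise. My problem now is that I want the mean of the second column, value, over that same period: from after sunset until before sunrise, which spans two calendar dates. Essentially, I want one mean per night.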

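Is there a way to group and summarise so that I get the mean of the column for the values between sunset and the following sunrise, even though that window crosses two dates?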

r date datetime dplyr lubridate
1 Answer

For latitudes where sunset always falls before 00:00 and sunrise always falls before 12:00, we can shift Datetime back by 12 hours and group by the resulting date:

library(dplyr)
library(lubridate)

df %>% 
  group_by(sunset_date = date(Datetime - hours(12))) %>% 
  summarise(dt_start = min(Datetime), dt_end = max(Datetime), mean_value = mean(value))
#> # A tibble: 3 × 4
#>   sunset_date dt_start            dt_end              mean_value
#>   <date>      <dttm>              <dttm>                   <dbl>
#> 1 2021-08-31  2021-09-01 00:00:00 2021-09-01 06:00:00       13.3
#> 2 2021-09-01  2021-09-01 20:00:00 2021-09-02 06:00:00       23.6
#> 3 2021-09-02  2021-09-02 20:00:00 2021-09-02 21:00:00       10.5

Created on 2023-10-10 with reprex v2.0.2
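
If the 12-hour assumption is a concern, one possible alternative is to use the sunrise column that is already in the data: any row that falls before its own day's sunrise is assigned to the night that began the previous evening. The following is only a minimal sketch, not part of the original answer. It assumes that date is the calendar date of Datetime in the same time zone and that sunrise holds that date's sunrise. The names night and nightly are made up for illustration, and the small df below is a four-row stand-in built from values in the question.

```r
library(dplyr)

# Four-row stand-in for the question's data (values copied from its rows 7, 8, 18 and 19)
df <- tibble(
  Datetime = as.POSIXct(c("2021-09-01 06:00:00", "2021-09-01 20:00:00",
                          "2021-09-02 06:00:00", "2021-09-02 20:00:00"),
                        tz = "America/Chicago"),
  value   = c(9, 0, 84, 21),
  date    = as.Date(c("2021-09-01", "2021-09-01", "2021-09-02", "2021-09-02")),
  sunrise = as.POSIXct(c("2021-09-01 06:42:27", "2021-09-01 06:42:27",
                         "2021-09-02 06:43:14", "2021-09-02 06:43:14"),
                       tz = "America/Chicago")
)

nightly <- df %>%
  # Rows before that day's sunrise belong to the night that started the evening before
  mutate(night = if_else(Datetime < sunrise, date - 1, date)) %>%
  group_by(night) %>%
  summarise(dt_start = min(Datetime), dt_end = max(Datetime),
            mean_value = mean(value))
```

With these four rows, nightly has one row per night (2021-08-31, 2021-09-01, 2021-09-02) with mean values 9, 42 and 21. On the full data from the question, this grouping should give the same three groups as the 12-hour shift, since every retained row is either before sunrise or after sunset.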
