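I want to create a dataset to visualize the daily minimum, maximum, and mean temperature in a graph. I have a time series dataset with the date (datetimestamp) in one column and water quality variables (temperature, pH, etc.) in the following columns. The data are reported every 15 minutes. The dataset contains NAs where the sensor malfunctioned.
The head of the data frame ldq1 looks like this: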
datetimestamp temp
1 2007-01-01 00:00:00 16.4
2 2007-01-01 00:15:00 16.4
3 2007-01-01 00:30:00 16.4
4 2007-01-01 00:45:00 16.4
5 2007-01-01 01:00:00 16.4
6 2007-01-01 01:15:00 16.4
7 2007-01-01 01:30:00 16.3
8 2007-01-01 01:45:00 16.2
9 2007-01-01 02:00:00 16.3
The datetimestamp class:
class(ldq1$datetimestamp)
[1] "POSIXct" "POSIXt"
temp is a double.
I am summarising the daily max, min, and mean as follows:
l <- ldq1 %>%
  group_by(date = as.Date(datetimestamp)) %>%
  summarise(t_min = min(temp, na.rm = TRUE),
            t_max = max(temp, na.rm = TRUE),
            t_mean = mean(temp, na.rm = TRUE), .groups = 'drop')
But the output looks like this:
Warning message:
There were 6705 warnings in `summarise()`.
The first warning was:
ℹ In argument: `t_min = min(temp, na.rm = TRUE)`.
ℹ In group 187: `date = 2007-07-06`.
Caused by warning in `min()`:
! no non-missing arguments, returning NA
l
# A tibble: 5,845 × 4
date t_min t_max t_mean
<date> <chr> <chr> <dbl>
1 2007-01-01 15.7 17.1 NA
2 2007-01-02 15.2 16.7 NA
3 2007-01-03 14.7 15.7 NA
4 2007-01-04 15.3 16.1 NA
5 2007-01-05 15.6 16.9 NA
6 2007-01-06 15.9 17.3 NA
7 2007-01-07 16.3 17.8 NA
8 2007-01-08 16.7 17.8 NA
9 2007-01-09 16 17.5 NA
10 2007-01-10 14.7 16.7 NA
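Where did I go wrong?
As @Dave2e and @Jon Spring have already mentioned, the problem is that your temp column is stored as a character variable rather than a numeric one (note the <chr> type under t_min and t_max in your output). So you only need to add one step that does the conversion, using mutate and across.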
ldq1 %>%
# Transform temp from char to numeric
mutate(across(temp, ~as.numeric(.x))) %>%
group_by(date = as.Date(datetimestamp)) %>%
summarise(t_min = min(temp, na.rm = TRUE),
t_max = max(temp, na.rm = TRUE),
t_mean = mean(temp, na.rm = TRUE), .groups = 'drop')
# A tibble: 1 × 4
# date t_min t_max t_mean
# <date> <dbl> <dbl> <dbl>
#1 2007-01-01 16.2 16.4 16.4
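One follow-up that may be worth checking: once temp is numeric, any day on which every reading is NA will make min() and max() return Inf and -Inf (with a warning) and mean() return NaN when na.rm = TRUE is used. The sketch below is one possible way to handle that, not the only one. The sample data and the names safe_stat, bad, and daily are made up for illustration; only ldq1, datetimestamp, and temp come from the question.

```r
library(dplyr)

# Hypothetical sample data: temp stored as character, second day entirely missing
ldq1 <- tibble(
  datetimestamp = as.POSIXct(
    c("2007-01-01 00:00:00", "2007-01-01 00:15:00",
      "2007-01-02 00:00:00", "2007-01-02 00:15:00"),
    tz = "UTC"
  ),
  temp = c("16.4", "16.2", NA, NA)
)

# Check how temp is actually stored
class(ldq1$temp)

# Entries that are present but would not convert cleanly to numeric
bad <- ldq1$temp[!is.na(ldq1$temp) &
                   is.na(suppressWarnings(as.numeric(ldq1$temp)))]

# Hypothetical helper: apply f to the non-missing values,
# or return NA when a group has no non-missing values at all
safe_stat <- function(x, f) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else f(x)
}

daily <- ldq1 %>%
  mutate(temp = as.numeric(temp)) %>%
  group_by(date = as.Date(datetimestamp)) %>%
  summarise(
    t_min  = safe_stat(temp, min),
    t_max  = safe_stat(temp, max),
    t_mean = safe_stat(temp, mean),
    .groups = "drop"
  )

daily
```

With this approach, days without any valid readings show up as NA in all three summary columns instead of triggering warnings, which tends to be easier to handle when plotting.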