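I have data.frames that represent different years, and each one has a date column. In each data frame I want to create a variable that groups the first 7 days of the year, the next 7 days, and so on. So "2020-01-17" would fall in "01-15 to 01-21".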
# Sample data
x1 <- data.frame(
  day = c("2020-02-21", "2020-01-19", "2020-01-30", "2020-01-17", "2020-02-18", "2020-02-31",
          "2020-02-21", "2020-01-02", "2020-01-28", "2020-02-27", "2020-02-29", "2020-02-11",
          "2020-01-05", "2020-02-06", "2020-02-10", "2020-01-31", "2020-02-18"),
  one = 1)
x2 <- data.frame(
  day = c("2021-02-21", "2021-01-19", "2021-01-30", "2021-01-17", "2021-02-18", "2021-02-31",
          "2021-02-21", "2021-01-02", "2021-01-28", "2021-02-27", "2021-02-29", "2021-02-11",
          "2021-01-05", "2021-02-06", "2021-02-10", "2021-01-31", "2021-02-18"),
  one = 1)
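I know how to convert days to weeks, but if I use format() to drop the year, as suggested in "Remove year from dates in R", the result is a character vector, and then I can't use cut() on it.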
x2$day <- as.Date( x2$day , "%Y-%m-%d")
x1$day <- as.Date( x1$day , "%Y-%m-%d")
x1$day2 <- format( x1$day , "%m-%d")
class( x1$day2)
If I don't drop the year, the same calendar date ends up in different weeks.
# With cut(), "2020-02-21" and "2021-02-21" land in different weeks. I want them in the same bin.
cut(as.Date(x2$day), breaks="week")
cut(as.Date(x1$day), breaks="week")
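That won't work if the data include both leap and non-leap years, because the i-th day of the year then falls on different calendar dates, so a group can't be tied to one fixed date range. What we can do is name the groups by group number. Let's number the first 14 days of the year 0, the next 14 days 1, and so on. For the 7-day groups in the question, divide by 7 instead of 14.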
transform(x1, g = as.POSIXlt(day, format = "%Y-%m-%d")$yday %/% 14)
giving:
          day one  g
1  2020-02-21   1  3
2  2020-01-19   1  1
3  2020-01-30   1  2
4  2020-01-17   1  1
5  2020-02-18   1  3
6  2020-02-31   1 NA
7  2020-02-21   1  3
8  2020-01-02   1  0
9  2020-01-28   1  1
10 2020-02-27   1  4
11 2020-02-29   1  4
12 2020-02-11   1  2
13 2020-01-05   1  0
14 2020-02-06   1  2
15 2020-02-10   1  2
16 2020-01-31   1  2
17 2020-02-18   1  3
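To get 7-day groups with range labels like the "01-15 to 01-21" asked for in the question, the same yday idea can be extended. The sketch below is only one possible way to do it: week_bin() and its width argument are hypothetical names, not part of the answer above, and each label is computed within the date's own year. That means groups after 29 February shift by one day in leap years, which is the caveat this answer starts with.

```r
# Hypothetical helper (not from the answer above): bins of `width` days counted from 1 January.
week_bin <- function(day, width = 7) {
  d     <- as.Date(day, format = "%Y-%m-%d")  # impossible dates such as "2020-02-31" become NA
  yday  <- as.POSIXlt(d)$yday                 # 0-based day of the year
  start <- d - yday %% width                  # first day of the bin
  dec31 <- as.Date(paste0(format(d, "%Y"), "-12-31"), format = "%Y-%m-%d")
  end   <- pmin(start + width - 1, dec31)     # the last bin of a year is shorter
  label <- paste(format(start, "%m-%d"), "to", format(end, "%m-%d"))
  label[is.na(d)] <- NA                       # keep unparseable dates as NA
  data.frame(g = yday %/% width, label = label)
}

# Rows 1 and 2 both get the label "01-15 to 01-21"; row 3 is NA.
week_bin(c("2020-01-17", "2021-01-17", "2020-02-31"))
```

The result could be attached with something like cbind(x1, week_bin(x1$day)), assuming the column is called day as in the sample data.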
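You were on the right track with cut().
With a weekly sequence that spans the whole year, you can use findInterval() to find the index of the sequence element (that is, the week number) that matches each date.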
library(dplyr)
library(lubridate)
x1 <- data.frame(
  day = ymd(c("2020-02-21", "2020-01-19", "2020-01-30", "2020-01-17", "2020-02-18", "2020-02-31",
              "2020-02-21", "2020-01-02", "2020-01-28", "2020-02-27", "2020-02-29", "2020-02-11",
              "2020-01-05", "2020-02-06", "2020-02-10", "2020-01-31", "2020-02-18")),
  one = 1)
#> Warning: 1 failed to parse.
x2 <- data.frame(
  day = ymd(c("2021-02-21", "2021-01-19", "2021-01-30", "2021-01-17", "2021-02-18", "2021-02-31",
              "2021-02-21", "2021-01-02", "2021-01-28", "2021-02-27", "2021-02-29", "2021-02-11",
              "2021-01-05", "2021-02-06", "2021-02-10", "2021-01-31", "2021-02-18")),
  one = 1)
#> Warning: 2 failed to parse.
x1seq <- seq(dmy("01-01-2020"), dmy("31-12-2020"), by = "7 days")
x2seq <- seq(dmy("01-01-2021"), dmy("31-12-2021"), by = "7 days")
x1 %>%
  mutate(week_start = x1seq[findInterval(day, x1seq,
                                         rightmost.closed = FALSE,
                                         left.open = FALSE,
                                         all.inside = FALSE)])
#>           day one week_start
#> 1  2020-02-21   1 2020-02-19
#> 2  2020-01-19   1 2020-01-15
#> 3  2020-01-30   1 2020-01-29
#> 4  2020-01-17   1 2020-01-15
#> 5  2020-02-18   1 2020-02-12
#> 6        <NA>   1       <NA>
#> 7  2020-02-21   1 2020-02-19
#> 8  2020-01-02   1 2020-01-01
#> 9  2020-01-28   1 2020-01-22
#> 10 2020-02-27   1 2020-02-26
#> 11 2020-02-29   1 2020-02-26
#> 12 2020-02-11   1 2020-02-05
#> 13 2020-01-05   1 2020-01-01
#> 14 2020-02-06   1 2020-02-05
#> 15 2020-02-10   1 2020-02-05
#> 16 2020-01-31   1 2020-01-29
#> 17 2020-02-18   1 2020-02-12
x2 %>%
  mutate(week_start = x2seq[findInterval(day, x2seq,
                                         rightmost.closed = FALSE,
                                         left.open = FALSE,
                                         all.inside = FALSE)])
#>           day one week_start
#> 1  2021-02-21   1 2021-02-19
#> 2  2021-01-19   1 2021-01-15
#> 3  2021-01-30   1 2021-01-29
#> 4  2021-01-17   1 2021-01-15
#> 5  2021-02-18   1 2021-02-12
#> 6        <NA>   1       <NA>
#> 7  2021-02-21   1 2021-02-19
#> 8  2021-01-02   1 2021-01-01
#> 9  2021-01-28   1 2021-01-22
#> 10 2021-02-27   1 2021-02-26
#> 11       <NA>   1       <NA>
#> 12 2021-02-11   1 2021-02-05
#> 13 2021-01-05   1 2021-01-01
#> 14 2021-02-06   1 2021-02-05
#> 15 2021-02-10   1 2021-02-05
#> 16 2021-01-31   1 2021-01-29
#> 17 2021-02-18   1 2021-02-12
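Note that week_start still contains the year, so the two data frames do not yet share a bin variable. One way to get there, sketched here in base R and assuming the bins should start on 1 January of each year, is to keep the index that findInterval() returns, or to drop the year from the bin start with format(). The names d1, d2, s1, s2, w1 and w2 are made up for this illustration.

```r
d1 <- as.Date(c("2020-02-21", "2020-01-17"))
d2 <- as.Date(c("2021-02-21", "2021-01-17"))

# One sequence of bin starts per year, beginning on 1 January
s1 <- seq(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "7 days")
s2 <- seq(as.Date("2021-01-01"), as.Date("2021-12-31"), by = "7 days")

# findInterval() returns the position of the bin each date falls into
w1 <- findInterval(d1, s1)
w2 <- findInterval(d2, s2)

identical(w1, w2)          # TRUE: same calendar day, same bin number
format(s1[w1], "%m-%d")    # bin start without the year
format(s2[w2], "%m-%d")
```

For dates after 29 February the same calendar day can land in a different bin in a leap year, because the bins are counted from 1 January.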
Created on 2023-05-17 with reprex v2.0.2