Assigning days to months by week in R


I have a strange problem.

I have a data frame with dates:

weeks_frame <- structure(list(start_week = c("02-01-2023", "09-01-2023", "16-01-2023", 
"23-01-2023", "30-01-2023", "06-02-2023", "13-02-2023", "20-02-2023", 
"27-02-2023", "06-03-2023", "13-03-2023", "20-03-2023", "27-03-2023", 
"03-04-2023", "10-04-2023", "17-04-2023", "24-04-2023", "01-05-2023", 
"08-05-2023", "15-05-2023", "22-05-2023", "29-05-2023", "05-06-2023", 
"12-06-2023", "19-06-2023", "26-06-2023", "03-07-2023", "10-07-2023", 
"17-07-2023", "24-07-2023", "31-07-2023", "07-08-2023", "14-08-2023", 
"21-08-2023", "28-08-2023", "04-09-2023", "11-09-2023", "18-09-2023", 
"25-09-2023", "02-10-2023", "09-10-2023", "16-10-2023", "23-10-2023", 
"30-10-2023", "06-11-2023", "13-11-2023", "20-11-2023", "27-11-2023", 
"04-12-2023", "11-12-2023", "18-12-2023", "25-12-2023"), end_week = c("01-01-2023", 
"08-01-2023", "15-01-2023", "22-01-2023", "29-01-2023", "05-02-2023", 
"12-02-2023", "19-02-2023", "26-02-2023", "05-03-2023", "12-03-2023", 
"19-03-2023", "26-03-2023", "02-04-2023", "09-04-2023", "16-04-2023", 
"23-04-2023", "30-04-2023", "07-05-2023", "14-05-2023", "21-05-2023", 
"28-05-2023", "04-06-2023", "11-06-2023", "18-06-2023", "25-06-2023", 
"02-07-2023", "09-07-2023", "16-07-2023", "23-07-2023", "30-07-2023", 
"06-08-2023", "13-08-2023", "20-08-2023", "27-08-2023", "03-09-2023", 
"10-09-2023", "17-09-2023", "24-09-2023", "01-10-2023", "08-10-2023", 
"15-10-2023", "22-10-2023", "29-10-2023", "05-11-2023", "12-11-2023", 
"19-11-2023", "26-11-2023", "03-12-2023", "10-12-2023", "17-12-2023", 
"31-12-2023")), class = "data.frame", row.names = c(NA, -52L))

Inside my function, I have the following lines of code:

lapply(1:nrow(weeks_frame), function(x) {
    format(
      seq(as.Date(weeks_frame$start_week[x],format="%d-%m-%Y"),
               as.Date(weeks_frame$end_week[x],format="%d-%m-%Y"),by ='days')
    ,"%b")
    })

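It assigns every day to a month (I need to assign each week to one of two months), but it only works for some years and not for others. I have to say I don't understand why.

The error message I get is: Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument

What is going wrong here? Why does it fail only for some years while others work fine, even though the data frame structure stays the same?

Cheers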

r date sequence
2 Answers

0 votes

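One option for handling the case where start_week comes after end_week is to use the lubridate interval functions: interval() does not mind negative intervals, int_standardize() flips the interval's start and end (if needed), and int_start() / int_end() can be used to access the start and end. The resulting months column is a list column: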


library(dplyr)
library(lubridate)

weeks_frame |>
  mutate(int = interval(dmy(start_week), dmy(end_week)) |> int_standardize()) |>
  rowwise() |>
  mutate(months = seq(int_start(int), int_end(int), by = "days") |> strftime("%b") |> list()) |> 
  ungroup() |>
  as.data.frame() |>
  head()
#>   start_week   end_week                            int       months
#> 1 02-01-2023 01-01-2023 2023-01-01 UTC--2023-01-02 UTC   jaan, jaan
#> 2 09-01-2023 08-01-2023 2023-01-08 UTC--2023-01-09 UTC   jaan, jaan
#> 3 16-01-2023 15-01-2023 2023-01-15 UTC--2023-01-16 UTC   jaan, jaan
#> 4 23-01-2023 22-01-2023 2023-01-22 UTC--2023-01-23 UTC   jaan, jaan
#> 5 30-01-2023 29-01-2023 2023-01-29 UTC--2023-01-30 UTC   jaan, jaan
#> 6 06-02-2023 05-02-2023 2023-02-05 UTC--2023-02-06 UTC veebr, veebr
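
As a minimal base R sketch of why the original call fails: in the sample data, start_week falls one day after end_week in every row except the last, so seq() is asked to count from a later date down to an earlier one with a positive step. The small data frame `wf` below is only an excerpt (rows 1, 2 and 52 of the question's data) chosen for illustration. Whether the end_week column is simply shifted by one row in the real data is an assumption.

```r
# Excerpt of the question's data: rows 1, 2 and 52
wf <- data.frame(
  start_week = c("02-01-2023", "09-01-2023", "25-12-2023"),
  end_week   = c("01-01-2023", "08-01-2023", "31-12-2023")
)

start <- as.Date(wf$start_week, format = "%d-%m-%Y")
end   <- as.Date(wf$end_week,   format = "%d-%m-%Y")

# TRUE for rows where the week "ends" before it starts
reversed <- start > end
reversed

# seq() with a positive step cannot count downwards, so it raises an error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  {
    seq(start[1], end[1], by = "days")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

Running `which(start > end)` on the full data frame is a quick way to find the rows that trigger the error in a given year.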

0 votes

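[Edit: just to add, if the output below is indeed what you want, and your weeks are basically just a calendar, you could drop the data.frame entirely and do, for example:

"2023-01-02" |> as.Date() |> seq(by=1, length.out=52 * 7) |> format("%b") |> matrix(ncol=7, byrow=TRUE) |> list()
]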

Adapting your own base R code, assuming the week starts are correct:

lapply(
  as.Date(weeks_frame$start_week, format="%d-%m-%Y"),   # starts of weeks as Dates
  function(d)  format(seq(d, length.out=7, by=1), "%b") # months of each day in each week
  )

Output snippet:

[[4]]
[1] "Jan" "Jan" "Jan" "Jan" "Jan" "Jan" "Jan"

[[5]]
[1] "Jan" "Jan" "Feb" "Feb" "Feb" "Feb" "Feb"

[[6]]
[1] "Feb" "Feb" "Feb" "Feb" "Feb" "Feb" "Feb"
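
The question also mentions assigning each week to one of two months. One possible reading, sketched here as an assumption rather than as the asker's actual rule, is to pick the month that contains most of the week's seven days. The helper name `majority_month` is hypothetical, and "%Y-%m" is used instead of "%b" so the result does not depend on the locale.

```r
# Three week starts taken from the question's data
week_starts <- as.Date(c("23-01-2023", "30-01-2023", "06-02-2023"),
                       format = "%d-%m-%Y")

# Hypothetical helper: the year-month holding most of the 7 days
majority_month <- function(d) {
  days   <- seq(d, by = 1, length.out = 7)  # the seven days of the week
  ym     <- format(days, "%Y-%m")           # locale-independent month label
  counts <- table(ym)                       # number of days per month
  names(counts)[which.max(counts)]          # month with the most days
}

result <- vapply(week_starts, majority_month, character(1))
result
```

Because a week has an odd number of days, two months can never tie, so the choice is always unambiguous.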