Calculating date variables for time series analysis with a dynamic approach

Question

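In the example below, I would like to calculate the variables Date_replaced and Date_current_until for each customer and product, based on Date.

Date_replaced reflects the point in time at which a product purchased by a customer was replaced by the purchase of another product.

Date_current_until reflects the point in time until which a product was the customer's current product.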

library(dplyr)

# Example data: daily records for two customers
Customer_ID <- c(rep(1, 4), rep(2, 5))
Date <- c(seq(as.Date("2024-02-01"), as.Date("2024-02-04"), "days"),
          seq(as.Date("2024-02-01"), as.Date("2024-02-05"), "days"))
Goods <- c(rep("Food", 3), "Toys", "Newspaper", rep("Food", 3), "Toys")
Date_first_purchased <- as.Date(c(rep("2024-02-01", 3), "2024-02-04", "2024-02-01", rep("2024-02-02", 3), "2024-02-05"))

# Desired results, entered by hand to show the target values
Date_replaced <- as.Date(c(rep("2024-02-04", 3), NA, "2024-02-02", rep("2024-02-05", 3), NA))
Date_current_until <- as.Date(c(rep("2024-02-04", 4), "2024-02-02", rep("2024-02-05", 4)))

df <- data.frame(Customer_ID, Date, Goods, Date_first_purchased, Date_replaced, Date_current_until)

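So far I have tried some common approaches for working with time series data, for example grouping the data by ID and arranging it by Date within each ID. I also tried combining lead() with mutate, without success, because lag() and lead() only seem to support a fixed number of positions for looking up previous or subsequent entries. This problem, however, clearly calls for a dynamic solution, which I have not found yet. Perhaps someone here has solved a similar problem before. Any suggestions on how to do this would be appreciated.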
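To make the limitation concrete, here is a minimal sketch, assuming dplyr 1.1.0 or later (for `.by`). The frame `toy` and the column `next_date` are illustrative names that are not part of the question. It shows that `lead()` shifts by a fixed number of rows, so it returns the next row's date rather than the date of the next product change.

```r
library(dplyr)

# Illustrative frame with the same values as customer 1 in the question
toy <- data.frame(
  Customer_ID = 1,
  Date  = as.Date("2024-02-01") + 0:3,
  Goods = c("Food", "Food", "Food", "Toys")
)

# lead() moves every value up by a fixed offset (n = 1 by default)
shifted <- toy %>%
  mutate(next_date = lead(Date), .by = Customer_ID)

shifted$next_date
```

For the three Food rows this yields three different dates (the following day each time), whereas the desired Date_replaced is the same date, 2024-02-04, for all of them.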

Answer 1

library(tidyverse)

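df %>%
  mutate(
    # On the last row of a product run the next product differs: take the next row's date.
    # Otherwise fall back to the customer's latest date, max(Date).
    Date_replaced = if_else(lead(Goods) != Goods,
                            lead(Date),
                            max(Date)),
    # The customer's final row has no successor, so keep its own Date there
    Date_current_until = if_else(row_number() == n(),
                                 Date,
                                 if_else(lead(Goods) != Goods,
                                         lead(Date),
                                         max(Date))),
    .by = Customer_ID
  )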

# A tibble: 9 × 6
  Customer_ID Date       Goods     Date_first_purchased Date_replaced Date_current_until
        <dbl> <date>     <chr>     <date>               <date>        <date>            
1           1 2024-02-01 Food      2024-02-01           2024-02-04    2024-02-04        
2           1 2024-02-02 Food      2024-02-01           2024-02-04    2024-02-04        
3           1 2024-02-03 Food      2024-02-01           2024-02-04    2024-02-04        
4           1 2024-02-04 Toys      2024-02-04           NA            2024-02-04        
5           2 2024-02-01 Newspaper 2024-02-01           2024-02-02    2024-02-02        
6           2 2024-02-02 Food      2024-02-02           2024-02-05    2024-02-05        
7           2 2024-02-03 Food      2024-02-02           2024-02-05    2024-02-05        
8           2 2024-02-04 Food      2024-02-02           2024-02-05    2024-02-05        
9           2 2024-02-05 Toys      2024-02-05           NA            2024-02-05
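The answer above falls back to the customer's latest date whenever the next row holds the same product, which matches the desired values here because each customer's final product occupies exactly one row. As a hedged alternative, the sketch below labels each uninterrupted run of a product and derives both dates per run. It assumes dplyr 1.1.0 or later (for `consecutive_id()` and `.by`). The frame `purchases` and the helper columns `run`, `last_date` and `next_row_date` are hypothetical names introduced for illustration, and the data is made up so that the final product spans two rows.

```r
library(dplyr)

# Hypothetical data: the replacing product (Toys) is held for two days
purchases <- data.frame(
  Customer_ID = c(1, 1, 1, 1, 1),
  Date  = as.Date("2024-02-01") + 0:4,
  Goods = c("Food", "Food", "Food", "Toys", "Toys")
)

result <- purchases %>%
  arrange(Customer_ID, Date) %>%
  mutate(
    run = consecutive_id(Goods),   # id of each uninterrupted product run
    last_date = max(Date),         # customer's latest observed date
    next_row_date = lead(Date),    # date of the following row, NA on the last row
    .by = Customer_ID
  ) %>%
  mutate(
    # The row after the end of the run is the replacement; NA for the final run
    Date_replaced = last(next_row_date),
    # The final run stays current until the customer's latest date
    Date_current_until = coalesce(Date_replaced, last_date),
    .by = c(Customer_ID, run)
  ) %>%
  select(-run, -last_date, -next_row_date)

result
```

Grouping by run rather than by product name also keeps two separate periods apart if a customer returns to an earlier product later on.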
