Adding a new variable using lead() and possibly a loop


I have a dataset named loc_prime2 that looks like this:

Document.Name   locale                  Arrival     Leg.Number  no_legs
VCH028735       DENVER_COLORADO         12/2/2018   1           2
VCH028735       _NONE                   12/7/2018   2           2
VCH028776       HARLINGEN_TEXAS         12/2/2018   1           3
VCH028776       LUBBOCK_TEXAS           12/3/2018   2           3
VCH028776       NONE                    12/4/2018   3           3
VCH030440       MEMPHIS_TENNESSEE       5/12/2019   1           6
VCH030440       NASHVILLE_TENNESSEE     5/13/2019   2           6
VCH030440       KNOXVILLE_TENNESSEE     5/14/2019   3           6
VCH030440       CHATTANOOGA_TENNESSEE   5/15/2019   4           6
VCH030440       NASHVILLE_TENNESSEE     5/16/2019   5           6
VCH030440       Kennesaw,               5/18/2019   6           6
VCH031580       EUGENE_OREGON           7/8/2019    1           8
VCH031580       NEWPORT_OREGON          7/9/2019    2           8
VCH031580       CORVALLIS_OREGON        7/10/2019   3           8
VCH031580       EUGENE_OREGON           7/11/2019   4           8
VCH031580       EUREKA_CALIFORNIA       7/12/2019   5           8
VCH031580       REDDING_CALIFORNIA      7/15/2019   6           8
VCH031580       SACRAMENTO_CALIFORNIA   7/16/2019   7           8
VCH031580       _NONE                   7/17/2019   8           8

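I would like to add a new column that contains the arrival date following the current arrival date. Depending on the trip's no_legs, this needs to be done a different number of times. For example, the first Document.Name is in Denver on 12/2; the next locale associated with that Document.Name is _NONE, meaning there is no destination after Denver. So the rows for VCH028735 should collapse to: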

Document.Name    locale            Arrival    End        
VCH028735        DENVER_COLORADO   12/2/2018  12/7/2018  

Note that some trips have more than two legs. There are trips with up to 8 legs. For example, VCH031580 needs to collapse to:

Document.Name    locale                  Arrival    End
VCH031580        EUGENE_OREGON           7/8/2019   7/9/2019
VCH031580        NEWPORT_OREGON          7/9/2019   7/10/2019
VCH031580        CORVALLIS_OREGON        7/10/2019  7/11/2019
VCH031580        EUGENE_OREGON           7/11/2019  7/12/2019
VCH031580        EUREKA_CALIFORNIA       7/12/2019  7/15/2019
VCH031580        REDDING_CALIFORNIA      7/15/2019  7/16/2019
VCH031580        SACRAMENTO_CALIFORNIA   7/16/2019  7/17/2019

This is what I have for the case where no_legs is 2:

test <- as.data.frame(loc_prime2 %>% group_by(Document.Name) %>% mutate(
    end1 = as.Date(ifelse(Leg.Number == 1 & no_legs == 2, lead(Arrival), 0), 
    origin = '1970-01-01')

    # end mutate
    ) 
)

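But to handle the different values of no_legs, I think I need a loop or something similar. I'm sure there is a simple way to do what I want; I just can't see it. Any ideas?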

Thanks in advance.

r
1 Answer

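I think you are making this harder than it needs to be by considering the number of legs in each group. Assuming your arrival dates are sorted chronologically, all you need to do is group by Document.Name and then use lead to create the new End variable. Then you just remove the last row of each group (which will have NA for End):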

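library(dplyr)

loc_prime2 %>%
  group_by(Document.Name) %>%        # treat each trip separately
  mutate(End = lead(Arrival)) %>%    # End = Arrival of the next leg in the same trip
  select(Document.Name, locale, Arrival, End, Leg.Number) %>%
  filter(!is.na(End))                # the last leg of each trip has no next arrival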

#> # A tibble: 15 x 5
#> # Groups:   Document.Name [4]
#>    Document.Name locale                Arrival   End       Leg.Number
#>    <chr>         <chr>                 <chr>     <chr>          <int>
#>  1 VCH028735     DENVER_COLORADO       12/2/2018 12/7/2018          1
#>  2 VCH028776     HARLINGEN_TEXAS       12/2/2018 12/3/2018          1
#>  3 VCH028776     LUBBOCK_TEXAS         12/3/2018 12/4/2018          2
#>  4 VCH030440     MEMPHIS_TENNESSEE     5/12/2019 5/13/2019          1
#>  5 VCH030440     NASHVILLE_TENNESSEE   5/13/2019 5/14/2019          2
#>  6 VCH030440     KNOXVILLE_TENNESSEE   5/14/2019 5/15/2019          3
#>  7 VCH030440     CHATTANOOGA_TENNESSEE 5/15/2019 5/16/2019          4
#>  8 VCH030440     NASHVILLE_TENNESSEE   5/16/2019 5/18/2019          5
#>  9 VCH031580     EUGENE_OREGON         7/8/2019  7/9/2019           1
#> 10 VCH031580     NEWPORT_OREGON        7/9/2019  7/10/2019          2
#> 11 VCH031580     CORVALLIS_OREGON      7/10/2019 7/11/2019          3
#> 12 VCH031580     EUGENE_OREGON         7/11/2019 7/12/2019          4
#> 13 VCH031580     EUREKA_CALIFORNIA     7/12/2019 7/15/2019          5
#> 14 VCH031580     REDDING_CALIFORNIA    7/15/2019 7/16/2019          6
#> 15 VCH031580     SACRAMENTO_CALIFORNIA 7/16/2019 7/17/2019          7
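
The approach above relies on the rows already being in chronological order within each Document.Name, and the output shows that Arrival is stored as character. If the order cannot be guaranteed, one possible variation is to parse the dates and sort before calling lead. The following is only a sketch under assumptions: loc_small is a hypothetical stand-in for loc_prime2 built from a few rows of the question (deliberately shuffled), and the month/day/year format is inferred from the sample data.

```r
library(dplyr)

# Hypothetical stand-in for loc_prime2: a few rows from the question, shuffled on purpose
loc_small <- data.frame(
  Document.Name = c("VCH028776", "VCH028735", "VCH028776", "VCH028735", "VCH028776"),
  locale        = c("NONE", "DENVER_COLORADO", "HARLINGEN_TEXAS", "_NONE", "LUBBOCK_TEXAS"),
  Arrival       = c("12/4/2018", "12/2/2018", "12/2/2018", "12/7/2018", "12/3/2018"),
  stringsAsFactors = FALSE
)

result <- loc_small %>%
  # Assumption: Arrival is text in month/day/year form, as in the sample data
  mutate(Arrival = as.Date(Arrival, format = "%m/%d/%Y")) %>%
  # Sort within each trip so that lead() really returns the next arrival
  arrange(Document.Name, Arrival) %>%
  group_by(Document.Name) %>%
  mutate(End = lead(Arrival)) %>%
  # Drop the final leg of each trip, which has no following arrival
  filter(!is.na(End)) %>%
  ungroup()

print(result)
```

Sorting on the parsed Date column rather than on the raw text matters here, because character dates such as "7/10/2019" and "7/9/2019" do not sort chronologically as strings.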