Subtracting values from different rows

Question (votes: 0, answers: 5)

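I have been trying to move a calculation from Excel to R, and I am wondering whether there is a way to replicate an IF that steps to the next row.

My data is below. The DIFF column holds the result I get in Excel with a simple formula, =IF(A2=A3, (C2-B3) * 24, 0):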

NO  T_DATE              L_DATE              DIFF
AAA 10/08/2019 17:02:00 10/08/2019 20:35:00 5.83
AAA 10/08/2019 14:45:00 10/08/2019 15:10:00 11.78
AAA 10/08/2019 03:23:00 10/08/2019 10:25:00 17.32
AAA 09/08/2019 17:06:00 10/08/2019 01:11:00 25.70
AAA 08/08/2019 23:29:00 09/08/2019 10:27:00 0
BBB 08/08/2019 09:34:00 08/08/2019 21:19:00 22.23
BBB 07/08/2019 23:05:00 08/08/2019 06:09:00 18.03
BBB 07/08/2019 12:07:00 07/08/2019 20:25:00 22.32
BBB 06/08/2019 22:06:00 07/08/2019 08:53:00 22.77
BBB 06/08/2019 10:07:00 06/08/2019 19:44:00 0

I have been trying this in R without any luck. The code to build the data frame is below:

library(data.table)
library(lubridate)

NO <- c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB", "BBB")
T_DATE <- c( "10/08/2019 17:02:00",  "10/08/2019 14:45:00", "10/08/2019 03:23:00",  "09/08/2019 17:06:00", "08/08/2019 23:29:00",  "08/08/2019 09:34:00", "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00", "06/08/2019 10:07:00")

L_DATE <- c( "10/08/2019 20:35:00", "10/08/2019 15:10:00","10/08/2019 10:25:00", "10/08/2019 01:11:00","09/08/2019 10:27:00", "08/08/2019 21:19:00","08/08/2019 06:09:00","07/08/2019 20:25:00", "07/08/2019 08:53:00", "06/08/2019 19:44:00")

df <- data.frame(NO, T_DATE, L_DATE)

# Remove the helper vectors now that df holds them (DIFF is never created above, so it is not listed)
rm(L_DATE, NO, T_DATE)

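I don't know how to add the step that uses L_DATE from the first row and T_DATE from the following row (row 2). If the two NO values are the same, the first calculation would be 10/08/2019 20:35:00 - 10/08/2019 14:45:00.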


r
5 Answers
5
votes
library(dplyr)
library(lubridate)

df <- data.frame(
  NO = c("AAA", "AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB", "BBB"),
  T_DATE = dmy_hms(c( "10/08/2019 17:02:00",  "10/08/2019 14:45:00", "10/08/2019 03:23:00",  "09/08/2019 17:06:00", "08/08/2019 23:29:00",  "08/08/2019 09:34:00", "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00", "06/08/2019 10:07:00")),
  L_DATE = dmy_hms(c( "10/08/2019 20:35:00", "10/08/2019 15:10:00","10/08/2019 10:25:00", "10/08/2019 01:11:00","09/08/2019 10:27:00", "08/08/2019 21:19:00","08/08/2019 06:09:00","07/08/2019 20:25:00", "07/08/2019 08:53:00", "06/08/2019 19:44:00"))
)

# Within each NO, subtract the next row's T_DATE from this row's L_DATE, in hours.
# The last row of each group has no next row, so its DIFF is NA rather than 0.
df %>% 
  group_by(NO) %>% 
  mutate(DIFF = difftime(L_DATE, lead(T_DATE), units = "hours"))
#> # A tibble: 10 x 4
#> # Groups:   NO [2]
#>    NO    T_DATE              L_DATE              DIFF           
#>    <fct> <dttm>              <dttm>              <drtn>         
#>  1 AAA   2019-08-10 17:02:00 2019-08-10 20:35:00  5.833333 hours
#>  2 AAA   2019-08-10 14:45:00 2019-08-10 15:10:00 11.783333 hours
#>  3 AAA   2019-08-10 03:23:00 2019-08-10 10:25:00 17.316667 hours
#>  4 AAA   2019-08-09 17:06:00 2019-08-10 01:11:00 25.700000 hours
#>  5 AAA   2019-08-08 23:29:00 2019-08-09 10:27:00        NA hours
#>  6 BBB   2019-08-08 09:34:00 2019-08-08 21:19:00 22.233333 hours
#>  7 BBB   2019-08-07 23:05:00 2019-08-08 06:09:00 18.033333 hours
#>  8 BBB   2019-08-07 12:07:00 2019-08-07 20:25:00 22.316667 hours
#>  9 BBB   2019-08-06 22:06:00 2019-08-07 08:53:00 22.766667 hours
#> 10 BBB   2019-08-06 10:07:00 2019-08-06 19:44:00        NA hours

0
votes

As an alternative to difftime alone, you can use ifelse.
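
The code that originally accompanied this answer is not preserved here, so the following is only a sketch of how ifelse could mirror the Excel IF row by row. It assumes the rows are already ordered as in the question; the helper names parse_dt, nxt, same_no and hours are made up for illustration.

```r
# Hypothetical helper: parse the question's day/month/year timestamps.
parse_dt <- function(x) as.POSIXct(x, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

df <- data.frame(
  NO = rep(c("AAA", "BBB"), each = 5),
  T_DATE = parse_dt(c("10/08/2019 17:02:00", "10/08/2019 14:45:00", "10/08/2019 03:23:00",
                      "09/08/2019 17:06:00", "08/08/2019 23:29:00", "08/08/2019 09:34:00",
                      "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00",
                      "06/08/2019 10:07:00")),
  L_DATE = parse_dt(c("10/08/2019 20:35:00", "10/08/2019 15:10:00", "10/08/2019 10:25:00",
                      "10/08/2019 01:11:00", "09/08/2019 10:27:00", "08/08/2019 21:19:00",
                      "08/08/2019 06:09:00", "07/08/2019 20:25:00", "07/08/2019 08:53:00",
                      "06/08/2019 19:44:00")),
  stringsAsFactors = FALSE
)

n <- nrow(df)
nxt <- c(seq_len(n)[-1], NA)  # index of the following row; NA for the last row

# Excel's A2=A3 test: is the next row's NO the same as this row's?
same_no <- !is.na(nxt) & df$NO == df$NO[nxt]

# Excel's (C2-B3)*24: this row's L_DATE minus the next row's T_DATE, in hours.
hours <- as.numeric(difftime(df$L_DATE, df$T_DATE[nxt], units = "hours"))

df$DIFF <- ifelse(same_no, round(hours, 2), 0)
print(df)
```

Because the comparison is made against the physically next row, this version returns 0 at every change of NO and on the last row, just as the spreadsheet formula does.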


0
votes

Using tidyverse
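
The answer's code did not survive on this page. As a hedged sketch of what a tidyverse approach could look like (not the answerer's original code), the example below groups by NO, takes the next T_DATE with lead(), and replaces the trailing NA of each group with 0 to match the Excel output. The name result is an illustrative choice.

```r
library(dplyr)
library(lubridate)

df <- tibble(
  NO = rep(c("AAA", "BBB"), each = 5),
  T_DATE = dmy_hms(c("10/08/2019 17:02:00", "10/08/2019 14:45:00", "10/08/2019 03:23:00",
                     "09/08/2019 17:06:00", "08/08/2019 23:29:00", "08/08/2019 09:34:00",
                     "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00",
                     "06/08/2019 10:07:00")),
  L_DATE = dmy_hms(c("10/08/2019 20:35:00", "10/08/2019 15:10:00", "10/08/2019 10:25:00",
                     "10/08/2019 01:11:00", "09/08/2019 10:27:00", "08/08/2019 21:19:00",
                     "08/08/2019 06:09:00", "07/08/2019 20:25:00", "07/08/2019 08:53:00",
                     "06/08/2019 19:44:00"))
)

result <- df %>%
  group_by(NO) %>%
  mutate(
    # L_DATE of this row minus T_DATE of the next row in the same NO, as plain hours
    DIFF = as.numeric(difftime(L_DATE, lead(T_DATE), units = "hours")),
    # the last row of each group has no successor; use 0 there, like the Excel formula
    DIFF = coalesce(round(DIFF, 2), 0)
  ) %>%
  ungroup()

print(result)
```

Note that group_by() pairs rows by group membership rather than by physical adjacency, so this matches the spreadsheet only when all rows of one NO sit together, as they do in the question's data.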


0
votes

Using base R
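
The original base R code is missing from this page. One possible way to do it without extra packages, offered only as a sketch, is to shift T_DATE within each NO using ave() and then subtract. The names parse_dt and next_t are hypothetical.

```r
# Hypothetical helper: parse the question's day/month/year timestamps.
parse_dt <- function(x) as.POSIXct(x, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

df <- data.frame(
  NO = rep(c("AAA", "BBB"), each = 5),
  T_DATE = parse_dt(c("10/08/2019 17:02:00", "10/08/2019 14:45:00", "10/08/2019 03:23:00",
                      "09/08/2019 17:06:00", "08/08/2019 23:29:00", "08/08/2019 09:34:00",
                      "07/08/2019 23:05:00", "07/08/2019 12:07:00", "06/08/2019 22:06:00",
                      "06/08/2019 10:07:00")),
  L_DATE = parse_dt(c("10/08/2019 20:35:00", "10/08/2019 15:10:00", "10/08/2019 10:25:00",
                      "10/08/2019 01:11:00", "09/08/2019 10:27:00", "08/08/2019 21:19:00",
                      "08/08/2019 06:09:00", "07/08/2019 20:25:00", "07/08/2019 08:53:00",
                      "06/08/2019 19:44:00")),
  stringsAsFactors = FALSE
)

# Next T_DATE within each NO, in seconds since the epoch.
# ave() applies the shift group by group; the last row of each group gets NA.
next_t <- ave(as.numeric(df$T_DATE), df$NO, FUN = function(x) c(x[-1], NA))

# Seconds to hours, then fill the group-final NA with 0 as in the Excel formula.
df$DIFF <- round((as.numeric(df$L_DATE) - next_t) / 3600, 2)
df$DIFF[is.na(df$DIFF)] <- 0
print(df)
```

Working on the numeric form of the date-times keeps ave() simple, at the cost of doing the seconds-to-hours conversion by hand.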


-1
votes
It looks like you want one of the lead/lag functions described here: https://dplyr.tidyverse.org/reference/lead-lag.html
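
For readers unfamiliar with these functions, here is a small illustration on a made-up vector x (not the question's data): lead() shifts values up so that each position sees the next element, and lag() shifts them down.

```r
library(dplyr)

x <- c(10, 20, 30, 40)

nxt <- dplyr::lead(x)                   # next element; the last position becomes NA
prv <- dplyr::lag(x)                    # previous element; the first position becomes NA
nxt_zero <- dplyr::lead(x, default = 0) # same as nxt, but pads with 0 instead of NA

print(nxt)
print(prv)
print(nxt_zero)
```

Applied to the question, lead(T_DATE) inside a grouped mutate() would give each row the T_DATE of the following row with the same NO.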