R: Check whether a value falls between two dates, and count the matches

Question (votes: 2, answers: 1)

I have the following data frame in R:

df1 <- data.frame(id = c(12332, 231231, 123123, 1231231, 123123),
                  date_in = c("04/08/2019 04:00", "04/08/2019 06:00", "05/08/2019 04:00",
                              "08/08/2019 12:00", "12/08/2019 04:00"),
                  # five values, one per id (fifth value taken from the answer's data)
                  date_out = c("20/08/2019 04:00", "14/08/2019 13:00", "11/08/2019 04:00",
                               "30/08/2019 04:00", "30/08/2019 04:00"))

I want to create another data frame that counts how many IDs are present on each day of each month.

Is there a way to do this?

Thank you!

r date postfix-notation
1 answer
Votes: 1

A tidyverse approach could be:

library(dplyr)

df1 %>%
  #Convert to date
  mutate_at(-1, ~as.Date(lubridate::dmy_hm(.))) %>%
  #Create a sequence of date from date_in to date_out
  mutate(date = purrr::map2(date_in, date_out, seq, by  ="1 day")) %>%
  #Get list dates in long format
  tidyr::unnest(date) %>%
  #Count number of id's in each date
  count(date)

# A tibble: 27 x 2
#   date           n
#   <date>     <int>
# 1 2019-08-04     2
# 2 2019-08-05     3
# 3 2019-08-06     3
# 4 2019-08-07     3
# 5 2019-08-08     4
# 6 2019-08-09     4
# 7 2019-08-10     4
# 8 2019-08-11     4
# 9 2019-08-12     4
#10 2019-08-13     4
# … with 17 more rows
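
For comparison, here is a minimal base R sketch of the same idea that does not need any packages. It is not part of the original answer. It assumes the fifth `date_out` value is "30/08/2019 04:00", as in the data given with this answer, and the names `d_in`, `d_out`, `days` and `result` are only illustrative. Instead of expanding each row into a sequence of days, it builds one calendar of days and counts, for each day, the rows whose interval covers it.

```r
# Example data matching the question (date_out completed from the answer's data)
df1 <- data.frame(
  id = c(12332, 231231, 123123, 1231231, 123123),
  date_in = c("04/08/2019 04:00", "04/08/2019 06:00", "05/08/2019 04:00",
              "08/08/2019 12:00", "12/08/2019 04:00"),
  date_out = c("20/08/2019 04:00", "14/08/2019 13:00", "11/08/2019 04:00",
               "30/08/2019 04:00", "30/08/2019 04:00")
)

# Parse day/month/year; the time-of-day part is ignored when converting to Date.
# as.character() guards against the columns being factors.
d_in  <- as.Date(as.character(df1$date_in),  format = "%d/%m/%Y")
d_out <- as.Date(as.character(df1$date_out), format = "%d/%m/%Y")

# One entry per calendar day between the earliest date_in and the latest date_out
days <- seq(min(d_in), max(d_out), by = "day")

# For each day, count the rows whose [date_in, date_out] interval contains it
n <- vapply(
  seq_along(days),
  function(i) sum(d_in <= days[i] & days[i] <= d_out),
  integer(1)
)

result <- data.frame(date = days, n = n)
head(result)
```

One difference from the pipeline above: this version also keeps days on which no ID is present (with `n` equal to 0), whereas `count(date)` only returns days that occur at least once. With the sample data there are no such gaps, so both should give the same 27 rows.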

Data

df1 <- structure(list(id = c(12332, 231231, 123123, 1231231, 123123), 
date_in = structure(1:5, .Label = c("04/08/2019 04:00", "04/08/2019 06:00", 
"05/08/2019 04:00", "08/08/2019 12:00", "12/08/2019 04:00"
), class = "factor"), date_out = structure(c(3L, 2L, 1L, 
4L, 4L), .Label = c("11/08/2019 04:00", "14/08/2019 13:00", 
"20/08/2019 04:00", "30/08/2019 04:00"), class = "factor")), 
class = "data.frame", row.names = c(NA, -5L))