Count the number of buyers in a store for each hour from 00:00 to 24:00 using tidyverse


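I need to count the number of buyers in a store during each hour of the day. I copied the data from another, similar question, but that question does not seem to answer what I am looking for. I don't want to compute how long buyers stay in the store; I want the store's occupancy, that is, the number of buyers present in the store during each hour of the day. I need to do this using only tidyverse and lubridate.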

df <- structure(list(ID = c(101, 102, 103, 104, 105, 106, 107), 
                     Time_in = structure(c(1326309720, 1326309900, 1328990700, 
                                        1328997240, 1329000840, 1329004440, 
                                        1329004680), 
                    class = c("POSIXct", "POSIXt"), tzone = ""),  
                    Time_out = structure(c(1326313800, 1326317340, 1326317460, 
                                        1326324660, 1326328260, 1326335460, 
                                        1326335460), 
                    class = c("POSIXct", "POSIXt"), tzone = "")), .Names = 
                            c("ID", "Adm", "Disc"), 
                    row.names = c(NA, -7L), class = "data.frame")
r datetime tidyverse calculation
1 Answer

I'm assuming Adm and Disc are the actions buyers perform in the store.

Counting by year, month, day, and hour lets this scale to any year you want.

df <- structure(list(ID = c(101, 102, 103, 104, 105, 106, 107), 
                     Adm = structure(c(1326309720, 1326309900, 1328990700, 
                                       1328997240, 1329000840, 1329004440, 
                                       1329004680), 
                                     class = c("POSIXct", "POSIXt"), tzone = ""),  
                     Disc = structure(c(1326313800, 1326317340, 1326317460, 
                                        1326324660, 1326328260, 1326335460, 
                                        1326335460), 
                                      class = c("POSIXct", "POSIXt"), tzone = "")), .Names = 
                  c("ID", "Adm", "Disc"), 
                row.names = c(NA, -7L), class = "data.frame")

library(tidyverse)
library(lubridate)
#> 
#> Attachement du package : 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
by_hours <- df %>%
  gather(key = Type, Time, 2:3)

by_hours
#>     ID Type                Time
#> 1  101  Adm 2012-01-11 20:22:00
#> 2  102  Adm 2012-01-11 20:25:00
#> 3  103  Adm 2012-02-11 21:05:00
#> 4  104  Adm 2012-02-11 22:54:00
#> 5  105  Adm 2012-02-11 23:54:00
#> 6  106  Adm 2012-02-12 00:54:00
#> 7  107  Adm 2012-02-12 00:58:00
#> 8  101 Disc 2012-01-11 21:30:00
#> 9  102 Disc 2012-01-11 22:29:00
#> 10 103 Disc 2012-01-11 22:31:00
#> 11 104 Disc 2012-01-12 00:31:00
#> 12 105 Disc 2012-01-12 01:31:00
#> 13 106 Disc 2012-01-12 03:31:00
#> 14 107 Disc 2012-01-12 03:31:00
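
As a side note, gather() is superseded in current tidyr releases, and pivot_longer() is the suggested replacement. A minimal sketch of the same reshaping step follows, assuming tidyr 1.0.0 or later. The two-row `df_small` is a hypothetical stand-in for `df`, with times written in UTC. The rows come out interleaved per ID rather than all Adm first, which does not affect the hourly count.

```r
library(tidyverse)
library(lubridate)

# Hypothetical two-row stand-in for df (times in UTC)
df_small <- tibble(
  ID   = c(101, 102),
  Adm  = ymd_hms(c("2012-01-11 20:22:00", "2012-01-11 20:25:00")),
  Disc = ymd_hms(c("2012-01-11 21:30:00", "2012-01-11 22:29:00"))
)

# Same long format as gather(key = Type, Time, 2:3)
by_hours_long <- df_small %>%
  pivot_longer(cols = c(Adm, Disc), names_to = "Type", values_to = "Time")

by_hours_long
```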

by_hours %>%
  mutate( 
    Time = ymd_hms(Time),
    year = year(Time),
    month = month(Time),
    day = day(Time),
    hour = hour(Time),
  ) %>% 
  count(year, month, day, hour)
#> # A tibble: 10 x 5
#>     year month   day  hour     n
#>    <dbl> <dbl> <int> <int> <int>
#>  1  2012     1    11    20     2
#>  2  2012     1    11    21     1
#>  3  2012     1    11    22     2
#>  4  2012     1    12     0     1
#>  5  2012     1    12     1     1
#>  6  2012     1    12     3     2
#>  7  2012     2    11    21     1
#>  8  2012     2    11    22     1
#>  9  2012     2    11    23     1
#> 10  2012     2    12     0     2

Created on 2018-07-17 by the reprex package (v0.2.0).
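
The count above tallies entry and exit events per hour, which is not quite the same as occupancy. If the goal is the number of buyers present during each hour, one possible approach is to expand every visit into one row per clock hour it overlaps and then count. The sketch below is only an illustration under stated assumptions: `visits` is hypothetical toy data in UTC (not the data from the question), a buyer counts as present in every hour the visit touches, and rows where Disc is earlier than Adm are dropped as inconsistent.

```r
library(tidyverse)
library(lubridate)

# Hypothetical toy visits (times in UTC); the third one crosses midnight
visits <- tibble(
  ID   = c(1, 2, 3),
  Adm  = ymd_hms(c("2012-01-11 20:22:00", "2012-01-11 20:25:00",
                   "2012-01-11 23:40:00")),
  Disc = ymd_hms(c("2012-01-11 21:30:00", "2012-01-11 22:29:00",
                   "2012-01-12 00:10:00"))
)

occupancy <- visits %>%
  filter(Disc >= Adm) %>%                       # drop inconsistent rows
  mutate(
    first_hour = floor_date(Adm, "hour"),       # first clock hour touched
    n_hours = as.integer(round(difftime(
      floor_date(Disc, "hour"), first_hour, units = "hours"
    ))) + 1L                                    # number of clock hours touched
  ) %>%
  uncount(n_hours, .id = "k") %>%               # one row per visit per hour
  mutate(hour_start = first_hour + 3600 * (k - 1)) %>%
  count(hour_start, name = "buyers")            # buyers present per hour

occupancy
```

Hours in which nobody is in the store do not appear in the result; if a full 00:00 to 24:00 grid is needed, the missing hours would have to be added separately with a count of zero.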
