How to use dates as a filter

Question

My knowledge of R and scripting is almost nonexistent, so I hope you will be patient with this basic question.

library(lubridate)
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)

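With this code we get the sums of the amounts as a matrix, with the airports as rows and columns. Now I need the same result restricted to:

2017
2017.01
until 2017.01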

Answer 1

Since you are already using lubridate, I will show you an approach using dplyr (part of the tidyverse, like lubridate).

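All three solutions follow the same pattern: filter() from dplyr, together with the year(), month() and as_date() functions from lubridate, builds a condition that subsets your data, and the pipe %>% then passes the filtered data frame along to xtabs().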

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)

# For 2017
df %>% 
  filter(year(date.depature) == 2017) %>% 
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   0   0
#>             QNY   0   0   1
#>             QXO   0   4   0
#>             SYD   2   0   0

# 2017.01
df %>% 
  filter(year(date.depature) == 2017, month(date.depature) == 1) %>% 
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   0   0
#>             QNY   0   0   1
#>             QXO   0   0   0
#>             SYD   2   0   0

# until 2017.01
df %>% 
  filter(date.depature <= as_date("2017.01.01")) %>% 
  xtabs(as.integer(amount) ~ airport.arrival + airport.departure, .)
#>                airport.departure
#> airport.arrival CDG QNY QXO
#>             CDG   0   3   0
#>             QNY   0   0   0
#>             QXO   0   0   0
#>             SYD   1   0   0

Created on 2018-11-19 by the reprex package (v0.2.1)
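The phrase "until 2017.01" can be read two ways: everything before January 2017, or everything up to and including January 2017. The following is a minimal sketch of both readings in the same filter-then-xtabs style. It assumes amount is stored as plain numbers (so the sums do not depend on how strings are handled), and the names before_jan and through_jan are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Sample data from the question; amount is numeric here (an assumption of this sketch).
df <- data.frame(
  date.depature     = ymd(c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")),
  airport.departure = c("CDG", "QNY", "QXO", "CDG", "QNY"),
  airport.arrival   = c("SYD", "CDG", "QNY", "SYD", "QXO"),
  amount            = c(1, 3, 1, 10, 5)
)

# Reading 1: strictly before January 2017.
before_jan <- df %>%
  filter(date.depature < ymd("2017-01-01")) %>%
  xtabs(amount ~ airport.arrival + airport.departure, data = .)

# Reading 2: up to and including January 2017.
through_jan <- df %>%
  filter(date.depature < ymd("2017-02-01")) %>%
  xtabs(amount ~ airport.arrival + airport.departure, data = .)

before_jan
through_jan
```

Comparing against the first day of the following month avoids having to know how many days the cutoff month has.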

Answer 2

Why don't you make amount a number when you create df? Just get rid of the double quotes

amount <- c(1, 3, 1, 10, 5)

or


amount <- as.integer(c("1", "3", "1", "10", "5"))

This matters because as.integer(df$amount) does not return

c(1, 3, 1, 10, 5)

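When you created the data frame df, that vector was coerced to class "factor" (the default for strings in data.frame() before R 4.0.0), so as.integer() returns the factor's level codes, and what you have now is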

as.integer(df$amount)
#[1] 1 3 1 2 4

The correct way is

as.integer(as.character(df$amount))
#[1]  1  3  1 10  5

Or, more simply:

date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c(1, 3, 1, 10, 5)
date.depature <- as_date(date.depature)
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
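If the amounts have to stay as strings at creation time (for instance because they come from a file), another option is to stop data.frame() from building a factor in the first place. This is only a sketch: df_chr, amounts_int and codes are illustrative names, not part of the original code.

```r
# Amounts as strings, exactly as in the question.
amount <- c("1", "3", "1", "10", "5")

# stringsAsFactors = FALSE keeps the column as character in every R version.
df_chr <- data.frame(amount, stringsAsFactors = FALSE)

# A character column converts to the numbers it spells out.
amounts_int <- as.integer(df_chr$amount)

# For contrast: converting a factor directly yields the level codes instead.
codes <- as.integer(factor(amount))

amounts_int
codes
```

The level codes follow the sorted order of the distinct strings, which is why "10" lands between "1" and "3".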

Now the question.

This is basically a subsetting problem. Subset the data to extract the required years and months, then run the same xtabs command.

df1 <- df[year(df$date.depature) == 2017, ]              # all of 2017
df2 <- df1[month(df1$date.depature) == 1, ]              # January 2017 only
df3 <- rbind(df[year(df$date.depature) < 2017, ], df2)   # up to and including January 2017

Now run xtabs on the sub data frames above.

xtabs(amount ~ airport.arrival + airport.departure, df1)
xtabs(amount ~ airport.arrival + airport.departure, df2)
xtabs(amount ~ airport.arrival + airport.departure, df3)
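The same three subsets can also be described with logical vectors instead of intermediate data frames, which makes it easy to check the totals. The sketch below assumes numeric amounts as in the simplified data above; the mask names and the tabs list are hypothetical.

```r
library(lubridate)

df <- data.frame(
  date.depature     = ymd(c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")),
  airport.departure = c("CDG", "QNY", "QXO", "CDG", "QNY"),
  airport.arrival   = c("SYD", "CDG", "QNY", "SYD", "QXO"),
  amount            = c(1, 3, 1, 10, 5)
)

yr <- year(df$date.depature)
mo <- month(df$date.depature)

# One logical mask per requested period.
masks <- list(
  "2017"          = yr == 2017,
  "2017.01"       = yr == 2017 & mo == 1,
  "until 2017.01" = yr < 2017 | (yr == 2017 & mo == 1)
)

# Run the same xtabs call on each subset.
tabs <- lapply(masks, function(keep) {
  xtabs(amount ~ airport.arrival + airport.departure, df[keep, ])
})

# Total amount per period, as a quick sanity check.
sapply(tabs, sum)
```

Combining the conditions with | gives the same rows as stacking the two subsets row-wise.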

Answer 3

You need to subset df by date.depature inside the xtabs call. For year == 2017:

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[year(df$date.depature)==2017,])

For year == 2017 and month == 1:

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[year(df$date.depature)==2017 & month(df$date.depature)==1,])

Anything before January 2017:

xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df[df$date.depature<as_date("2017-01-01"),])
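As a variation on the calls above, xtabs() also accepts a subset argument that is evaluated inside the data frame, so the df$ prefix can be dropped. This is a minimal sketch, assuming amount is stored as numbers rather than strings; tab_2017 and tab_jan are illustrative names.

```r
library(lubridate)

df <- data.frame(
  date.depature     = ymd(c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")),
  airport.departure = c("CDG", "QNY", "QXO", "CDG", "QNY"),
  airport.arrival   = c("SYD", "CDG", "QNY", "SYD", "QXO"),
  amount            = c(1, 3, 1, 10, 5)
)

# The subset expression refers to columns of df directly.
tab_2017 <- xtabs(amount ~ airport.arrival + airport.departure, data = df,
                  subset = year(date.depature) == 2017)

tab_jan <- xtabs(amount ~ airport.arrival + airport.departure, data = df,
                 subset = year(date.depature) == 2017 & month(date.depature) == 1)

tab_2017
tab_jan
```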
