I'm working on a course assignment and need to create a subset of the data covering the first 2 days of February 2007.
Here is my code:
library(sqldf)
# set directory
setwd("C:/Users/thoma/Desktop/Files/Programming/R/EDA/EDAWk1")
# unzip source data
temp <- tempfile()
download.file("https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip", temp)
data <- read.table(unz(temp, "household_power_consumption.txt"), header = TRUE, sep = ';')
# we only want a specified range of dates
data_2 <- sqldf("
select
*
from data
where Date in ('2007-02-01','2007-02-02')
")
The intermediate dataset 'data' loads fine, but data_2 comes back empty. Does anyone know why this happens?
Cheers.
You can do this in base R:
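# Date is read in as d/m/Y text (e.g. "1/2/2007"), so parse it before comparing
data_2 <- subset(data, as.Date(Date, format = "%d/%m/%Y") %in% as.Date(c('2007-02-01','2007-02-02')))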
Or using dplyr and lubridate:
library(dplyr)
library(lubridate)
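# parse the d/m/Y text first, and pin the year so February of other years is excluded
data_2 <- data %>% mutate(Date = dmy(Date)) %>% filter(year(Date) == 2007, month(Date) == 2, day(Date) <= 2)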
An option with data.table:
library(data.table)
data_2 <- setDT(data)[as.IDate(Date, format = "%d/%m/%Y") %in% as.IDate(c('2007-02-01','2007-02-02'))]
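As for why the original query comes back empty: a likely cause, assuming the Date column in this file is stored as day/month/year text such as 1/2/2007 (worth confirming with head(data$Date)), is that the ISO-style literals in the WHERE clause never equal any value in the column. A minimal sketch on a small made-up data frame follows; the `toy` frame and its values are hypothetical stand-ins for the real data, not taken from the file.

```r
library(sqldf)

# Hypothetical stand-in for the real data: Date stored as d/m/Y text
toy <- data.frame(
  Date = c("31/1/2007", "1/2/2007", "2/2/2007", "3/2/2007"),
  Global_active_power = c(1.2, 0.3, 0.4, 2.1),
  stringsAsFactors = FALSE
)

# ISO-style literals match nothing, because the column holds plain d/m/Y text
iso_match <- sqldf("select * from toy where Date in ('2007-02-01','2007-02-02')")

# Literals written in the same d/m/Y form as the column do match
dmy_match <- sqldf("select * from toy where Date in ('1/2/2007','2/2/2007')")

nrow(iso_match)  # 0
nrow(dmy_match)  # 2
```

If that assumption holds for the real file, the smallest change to the original sqldf call would be to write the two literals as '1/2/2007' and '2/2/2007'.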