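I have a dataset that contains date and time variables. I created a new variable (derived from time) and called it "time.of.day". I want to assign different labels (four, actually) depending on the time period. I am trying the following: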
levels(df$time.of.day) <- list(
label_1 = df$time.of.day[df$time >= "07:00:00" & df$time <= "10:00:00"],
label_2 = df$time.of.day[df$time >= "10:00:00" & df$time <= "16:00:00"],
label_3 = df$time.of.day[df$time >= "16:00:00" & df$time <= "19:00:00"],
label_4 = df$time.of.day[df$time >= "19:00:00" & df$time <= "23:59:59"]
)
But nothing happens, and I get no errors or warnings.
Here is a sample of the columns mentioned above:
        date     time time.of.day
1 2014-03-21 09:20:08    09:20:08
2 2014-03-21 10:05:22    10:05:22
3 2014-03-26 05:34:04    05:34:04
4 2014-03-26 09:35:05    09:35:05
5 2014-03-27 01:45:03    01:45:03
6 2014-03-27 02:45:27    02:45:27
7 2014-03-27 14:46:26    14:46:26
8 2014-03-28 04:03:30    04:03:30
For the convenience of future users, here is the code that generates the data frame above:
df <- data.frame(
  date = c("2014-03-21", "2014-03-21", "2014-03-26", "2014-03-26", "2014-03-27", "2014-03-27", "2014-03-27", "2014-03-28"),
  time = c("09:20:08", "10:05:22", "05:34:04", "09:35:05", "01:45:03", "02:45:27", "14:46:26", "04:03:30"),
  time.of.day = c("09:20:08", "10:05:22", "05:34:04", "09:35:05", "01:45:03", "02:45:27", "14:46:26", "04:03:30"),
  # keep the columns as character vectors (R versions before 4.0 default to factors)
  stringsAsFactors = FALSE
)
P.S.: I did this in earlier work using unique, grep, and strings.
Could you please help? Thanks.
OK, so I solved this using "[". But I am still curious: why doesn't it work with levels and list?
df$time.of.day[df$time >= "00:00:00" & df$time <= "07:00:00"] <- "morning"
df$time.of.day[df$time >= "07:00:00" & df$time <= "10:00:00"] <- "home2work"
df$time.of.day[df$time >= "10:00:00" & df$time <= "16:00:00"] <- "mid_day"
df$time.of.day[df$time >= "16:00:00" & df$time <= "19:00:00"] <- "work2home"
df$time.of.day[df$time >= "19:00:00" & df$time <= "23:59:59"] <- "night"
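On the levels-and-list question, here is a minimal sketch of how the list form is documented to behave. The named-list assignment belongs to the factor method of levels<-, and each list element is expected to hold existing levels of that factor, which are then merged under the element's name. The original attempt passed subsets of a plain character column, so there were no levels to rename, which is probably why the values never changed. The names tod and lv are illustrative and not part of the original post.

```r
time <- c("09:20:08", "10:05:22", "05:34:04", "09:35:05",
          "01:45:03", "02:45:27", "14:46:26", "04:03:30")

# The list form needs a factor, so convert first (assumption: one level per distinct time)
tod <- factor(time)
lv  <- levels(tod)

# Each element lists the existing levels that should be renamed to that label.
# Half-open intervals keep a boundary value from falling into two groups.
levels(tod) <- list(
  morning   = lv[lv >= "00:00:00" & lv <  "07:00:00"],
  home2work = lv[lv >= "07:00:00" & lv <  "10:00:00"],
  mid_day   = lv[lv >= "10:00:00" & lv <  "16:00:00"],
  work2home = lv[lv >= "16:00:00" & lv <  "19:00:00"],
  night     = lv[lv >= "19:00:00" & lv <= "23:59:59"]
)

as.character(tod)
```

The string comparisons only work because every time is zero-padded to the same HH:MM:SS width, so text order matches clock order.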
Another option is:
library(chron)
indx <- c('00:00:00', '07:00:00', '10:00:00', '16:00:00',
'19:00:00', '23:59:59')
indx2 <- c('morning', 'home2work', 'mid_day', 'work2home', 'night')
h1 <- chron(times=df$time)
br <- chron(times=indx)
df$time.of.day <- cut(h1, br, labels=indx2)
df$time.of.day
#[1] home2work mid_day morning home2work morning morning mid_day
#[8] morning
#Levels: morning home2work mid_day work2home night
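If installing chron is not an option, the same binning idea can be sketched in base R by converting each time to seconds since midnight and passing the numbers to cut(). This is a swapped-in variant, not the method above; to_seconds, breaks, and tod are hypothetical names, and the intervals are assumed to be closed on the left and open on the right.

```r
# Hypothetical helper: "HH:MM:SS" -> seconds since midnight
to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

time <- c("09:20:08", "10:05:22", "05:34:04", "09:35:05",
          "01:45:03", "02:45:27", "14:46:26", "04:03:30")

# Interval start points, plus the end of the day (24 * 3600 seconds)
breaks <- c(to_seconds(c("00:00:00", "07:00:00", "10:00:00",
                         "16:00:00", "19:00:00")), 24 * 3600)
labels <- c("morning", "home2work", "mid_day", "work2home", "night")

# right = FALSE makes each bin [start, end), so 07:00:00 counts as home2work
tod <- cut(to_seconds(time), breaks = breaks, labels = labels, right = FALSE)
as.character(tod)
```

Working in seconds avoids relying on string ordering and makes the boundary rule explicit through right = FALSE.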
Or you can do this:
indx3 <- max.col(t(Vectorize(function(x) x>=indx[-length(indx)] &
x<= indx[-1])(df$time)), 'first')
indx2[indx3]
# [1] "home2work" "mid_day" "morning" "home2work" "morning" "morning"
# [7] "mid_day" "morning"
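The one-liner above is dense, so here is an unpacked sketch of the same idea: build a logical matrix with one row per time and one column per interval, then take the first matching column in each row. It uses outer() instead of Vectorize(), which is a substitution of mine; lower, upper, inside, and idx are illustrative names.

```r
time <- c("09:20:08", "10:05:22", "05:34:04", "09:35:05",
          "01:45:03", "02:45:27", "14:46:26", "04:03:30")

lower  <- c("00:00:00", "07:00:00", "10:00:00", "16:00:00", "19:00:00")
upper  <- c("07:00:00", "10:00:00", "16:00:00", "19:00:00", "23:59:59")
labels <- c("morning", "home2work", "mid_day", "work2home", "night")

# inside[i, j] is TRUE when time i lies within interval j (both ends inclusive)
inside <- outer(time, lower, ">=") & outer(time, upper, "<=")

# Multiplying by 1 turns the logical matrix into 0/1 numbers for max.col().
# ties.method = "first" picks the earliest interval when a time sits on a boundary.
# Caveat: a row with no TRUE at all would also return column 1.
idx <- max.col(inside * 1, ties.method = "first")
labels[idx]
```

Every time in the sample falls inside some interval, so the all-FALSE caveat does not arise here, but it is worth checking with rowSums(inside) on real data.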
df <- structure(list(date = c("2014-03-21", "2014-03-21", "2014-03-26",
"2014-03-26", "2014-03-27", "2014-03-27", "2014-03-27", "2014-03-28"
), time = c("09:20:08", "10:05:22", "05:34:04", "09:35:05", "01:45:03",
"02:45:27", "14:46:26", "04:03:30"), time.of.day = c("09:20:08",
"10:05:22", "05:34:04", "09:35:05", "01:45:03", "02:45:27", "14:46:26",
"04:03:30")), .Names = c("date", "time", "time.of.day"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8"))