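I have a large table with several columns; I've pasted an excerpt below. The table was imported through Microsoft Access, so the "index" column is all over the place and doesn't correspond to anything in particular. Basically, I want to add another column to the table that indexes the rows by date, earliest -> latest. So, independent of any other criteria, I'd like the earliest date to be "1", with the rest following in chronological order as 2, 3, 4, 5, and so on up to the last date.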
index - effort_ID - Tag ID - SUR - Date and Time
350162 - 244 - 92 - 10916 - 2016-12-14 19:25:00
77850 - 243 - 77 - 10913 - 2016-12-14 19:28:10
77858 - 243 - 79 - 10913 - 2016-12-14 19:39:11
Here is the data in a better format:
df <- structure(list(index = c(350162, 77850, 77858), effort_ID = c(244,
243, 243), `Tag ID` = c(92, 77, 79), SUR = c(10916, 10913, 10913
), `Date and Time` = c("2016-12-14 19:25:00", "2016-12-14 19:28:10",
"2016-12-14 19:39:11")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L))
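Then use lubridate to convert the column to date-time format, arrange() to sort by it, and rownames_to_column() from the tidyverse to turn the row index into a column: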
library(lubridate)
library(tidyverse)
df2 <- df %>%
  mutate(`Date and Time` = ymd_hms(`Date and Time`)) %>%
  arrange(`Date and Time`) %>%
  rownames_to_column(var = "new_index")
Result:
# A tibble: 3 x 6
new_index index effort_ID `Tag ID` SUR `Date and Time`
<chr> <dbl> <dbl> <dbl> <dbl> <dttm>
1 1 350162 244 92 10916 2016-12-14 19:25:00
2 2 77850 243 77 10913 2016-12-14 19:28:10
3 3 77858 243 79 10913 2016-12-14 19:39:11
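Note that new_index comes out as a character column (<chr>) above, and the rows are physically reordered. If you would rather keep the rows where they are and get an integer index, a minimal sketch (assuming dplyr is installed; the three-row data frame below is just a shuffled stand-in for the real table) is to rank the timestamps with row_number():

```r
library(dplyr)

# Stand-in data: the three example rows, deliberately out of chronological order.
df <- data.frame(
  index = c(77858, 350162, 77850),
  `Date and Time` = as.POSIXct(
    c("2016-12-14 19:39:11", "2016-12-14 19:25:00", "2016-12-14 19:28:10"),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  ),
  check.names = FALSE
)

# row_number(x) ranks the values of x, breaking ties by position,
# so the rows are not moved and new_index is a numeric rank.
df_ranked <- df %>%
  mutate(new_index = row_number(`Date and Time`))

df_ranked$new_index
```

Here the first row (19:39:11) is the latest and gets 3, while the second row (19:25:00) is the earliest and gets 1.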
First sort your df with arrange() from the dplyr package, then use mutate() to add an index column. This is easy with the pipe operator (%>%):
library(dplyr)
# Sort by timestamp, then number the rows 1..n in their new order
df %>% arrange(`Date and Time`) %>%
  mutate(new_index = row_number())
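One caveat when the column is still text, as it is in the dput() data: sorting strings only matches chronological order if every timestamp uses the same zero-padded format. A small sketch with made-up timestamps (not from the question) where the hour is not padded shows the difference:

```r
# Hypothetical timestamps; the second one has an unpadded hour.
x <- c("2016-12-14 19:25:00", "2016-12-14 9:05:00")

# As text, comparison is character by character, so "19:..." sorts before "9:...".
text_order <- order(x, method = "radix")

# Parsed as date-times, 09:05 correctly comes before 19:25.
parsed <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
time_order <- order(parsed)

text_order
time_order
```

If the real data is uniformly formatted like the sample, sorting the strings directly should be fine; parsing first is just the safer default.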
A base R solution:
df <- df[order(df$`Date and Time`),]
df$date_index <- 1:nrow(df)
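For completeness, here is the same base R idea as a self-contained sketch that parses the timestamps first. The column name date_time and the UTC time zone are assumptions made for the example, not something from the question:

```r
# Stand-in data with the rows out of chronological order.
df <- data.frame(
  index = c(77858, 350162, 77850),
  date_time = c("2016-12-14 19:39:11", "2016-12-14 19:25:00", "2016-12-14 19:28:10")
)

# Parse the text into POSIXct so the sort is truly chronological.
df$date_time <- as.POSIXct(df$date_time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Sort, then number the rows; seq_len() also behaves for a zero-row table.
df <- df[order(df$date_time), ]
df$date_index <- seq_len(nrow(df))
rownames(df) <- NULL  # drop the old row names left over from the sort

df
```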
Here is another possible solution that uses no libraries:
str<-c( "77858 - 243- 79 -10913 -2016-12-14 19:39:11",
"350162 - 244 - 92 - 10916 - 2016-12-14 19:25:00",
"77850 - 243 -77- 10913 -2016-12-14 19:28:10")
customer<-c("lina","rita","mina")
df <- data.frame(cust=customer,date=str)
df
cust date
1 lina 77858 - 243- 79 -10913 -2016-12-14 19:39:11
2 rita 350162 - 244 - 92 - 10916 - 2016-12-14 19:25:00
3 mina 77850 - 243 -77- 10913 -2016-12-14 19:28:10
Then extract the last 19 characters of each string, which hold the timestamp:
str <- substr(str, nchar(str) - 18, nchar(str))
str
[1] "2016-12-14 19:39:11" "2016-12-14 19:25:00" "2016-12-14 19:28:10"
df$newDate <- as.POSIXct(str, format = "%Y-%m-%d %H:%M:%S")
# rank() gives each row's chronological position; order() would return the sorting permutation instead
rownames(df) <- rank(as.numeric(df$newDate), ties.method = "first")
df
cust date newDate
3 lina 77858 - 243- 79 -10913 -2016-12-14 19:39:11 2016-12-14 19:39:11
1 rita 350162 - 244 - 92 - 10916 - 2016-12-14 19:25:00 2016-12-14 19:25:00
2 mina 77850 - 243 -77- 10913 -2016-12-14 19:28:10 2016-12-14 19:28:10
Finally, sort by the new row names and keep the result:
df <- df[order(as.numeric(rownames(df))), , drop = FALSE]
df
cust date newDate
1 rita 350162 - 244 - 92 - 10916 - 2016-12-14 19:25:00 2016-12-14 19:25:00
2 mina 77850 - 243 -77- 10913 -2016-12-14 19:28:10 2016-12-14 19:28:10
3 lina 77858 - 243- 79 -10913 -2016-12-14 19:39:11 2016-12-14 19:39:11
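A point that is easy to trip over when building a date index by hand: order() and rank() answer different questions. order() says which element belongs in each sorted position, while rank() says which position each element ends up in, and the index wanted here is the latter. A small sketch on the three example timestamps (UTC is an assumption for the example):

```r
d <- as.POSIXct(
  c("2016-12-14 19:39:11", "2016-12-14 19:25:00", "2016-12-14 19:28:10"),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# Permutation that sorts d: element 2 goes first, then 3, then 1.
perm <- order(d)

# Chronological position of each element: the first one is the latest (3rd).
pos <- rank(as.numeric(d), ties.method = "first")

# Applying order() twice gives the same positions as rank().
pos_via_order <- order(order(d))

perm
pos
```

So the latest timestamp (19:39:11) should get index 3 and the earliest (19:25:00) index 1, which is what pos gives.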