I have a dataset with a column whose values are month year strings, formatted like this:
M_Yr
March 1990
April 1990
May 1990
June 1990
July 1990
Aug 1990
Sept 1990
Oct 1990
Nov 1990
Dec 1990
Jan 1991
Feb 1991
March 1991
April 1991
May 1991
June 1991
July 1991
Aug 1991
Sept 1991
Oct 1991
Nov 1991
Dec 1991
I would like to convert it to numeric values like this:
M_Yr Col1
March 1990 1
April 1990 2
May 1990 3
June 1990 4
July 1990 5
Aug 1990 6
Sept 1990 7
Oct 1990 8
Nov 1990 9
Dec 1990 10
Jan 1991 11
Feb 1991 12
March 1991 13
April 1991 14
May 1991 15
June 1991 16
July 1991 17
Aug 1991 18
Sept 1991 19
Oct 1991 20
Nov 1991 21
Dec 1991 22
I tried this approach:
df$Col1 <- as.numeric(df$M_Yr)
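This does convert the month Year variable to numbers, but the numbers do not follow chronological order. Is there an efficient way to create this numeric variable without writing a lengthy case_when statement?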
Any suggestions are much appreciated. Thanks.
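For context, the scrambled numbers are what as.numeric() gives when the column is stored as a factor, which is an assumption here since the question does not show the column's class. The sketch below uses a small made-up vector `x` to show that the codes follow the alphabetical order of the factor levels rather than calendar order.

```r
# Assumption: M_Yr was read in as a factor (e.g. with stringsAsFactors = TRUE).
x <- factor(c("March 1990", "April 1990", "May 1990", "Jan 1991"))

# Factor levels are sorted alphabetically, not chronologically.
levels(x)

# as.numeric() returns each value's position within levels(x),
# so the codes follow alphabetical order of the labels.
codes <- as.numeric(x)
codes
```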
One approach is to convert the strings to dates and number them in chronological order. Here is one way using the lubridate package:
library(lubridate)
library(dplyr)
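df %>%
  mutate(date = my(M_Yr),               # parse "Month Year" into a Date
         # rank, not order(): order() returns a sort permutation and only
         # matches the rank when the rows are already sorted
         numeric = dense_rank(date))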
Output:
M_Yr date numeric
1 March 1990 1990-03-01 1
2 April 1990 1990-04-01 2
3 May 1990 1990-05-01 3
4 June 1990 1990-06-01 4
5 July 1990 1990-07-01 5
6 Aug 1990 1990-08-01 6
7 Sept 1990 1990-09-01 7
8 Oct 1990 1990-10-01 8
9 Nov 1990 1990-11-01 9
10 Dec 1990 1990-12-01 10
11 Jan 1991 1991-01-01 11
12 Feb 1991 1991-02-01 12
13 March 1991 1991-03-01 13
14 April 1991 1991-04-01 14
15 May 1991 1991-05-01 15
16 June 1991 1991-06-01 16
17 July 1991 1991-07-01 17
18 Aug 1991 1991-08-01 18
19 Sept 1991 1991-09-01 19
20 Oct 1991 1991-10-01 20
21 Nov 1991 1991-11-01 21
22 Dec 1991 1991-12-01 22
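If the rows might be unsorted or some months might be missing, a different technique is to count calendar months from the earliest month present. The following is a minimal base R sketch of that idea, not the method used above: `month_index` is a hypothetical helper name, and it assumes English month names whose first three letters match `month.abb` (so "Sept" and "March" both work).

```r
# Hypothetical helper (not part of lubridate or dplyr): position of each
# "Month Year" string, counted in months from the earliest month present.
month_index <- function(x) {
  parts <- strsplit(trimws(x), "\\s+")
  mon_txt <- vapply(parts, `[`, character(1), 1)
  yr <- as.integer(vapply(parts, `[`, character(1), 2))
  # The first three letters identify the month ("Sept" -> "Sep").
  mon <- match(substr(mon_txt, 1, 3), month.abb)
  months <- yr * 12L + mon
  months - min(months) + 1L
}

# Small made-up example with unsorted rows and missing months.
df <- data.frame(M_Yr = c("May 1990", "March 1990", "Sept 1990", "Jan 1991"))
df$Col1 <- month_index(df$M_Yr)
df
```

Unlike a rank, this keeps gaps: with March 1990 as month 1, Sept 1990 is still 7 and Jan 1991 is still 11 even though the months in between are absent.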
dput(df)
structure(list(M_Yr = c("March 1990", "April 1990", "May 1990",
"June 1990", "July 1990", "Aug 1990", "Sept 1990", "Oct 1990",
"Nov 1990", "Dec 1990", "Jan 1991", "Feb 1991", "March 1991",
"April 1991", "May 1991", "June 1991", "July 1991", "Aug 1991",
"Sept 1991", "Oct 1991", "Nov 1991", "Dec 1991")), class = "data.frame", row.names = c(NA,
-22L))