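I'm using R for the first time in a year and feel a bit lost. I'm having trouble importing a messy CSV file that contains a time series. The basic import below runs without problems: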
# Read the semicolon-separated file with base R
test_2 <- read.csv(file = "C:/Users/Downloads/dataset_univarie.csv", sep = ";",
                   header = TRUE)
View(test_2)
# Drop rows 504 to 515
test_3 <- test_2[-(504:515), ]
View(test_3)
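Up to this point everything works. However, I'm struggling when I try to import the original dataset in a different way, so that R recognizes the first column as a date column and I can work with the time series properly: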
library(readr)
read.csv(file = "C:/Users/Downloads/dataset_univarie.csv")
serie_elec <- read_delim("C:/Users/Downloads/dataset_univarie.csv", " ; ",
  escape_double = FALSE,
  col_types = cols(
    "Période" = col_date(format = "%Y-%m"),
    "Production.brute.d.électricité.nucléaire..en.GWh." = col_number()
  ),
  trim_ws = TRUE)
This is the message I get:
Warning message:
The following named parsers don't match the column names: Période,
Production.brute.d.électricité.nucléaire..en.GWh.
As a result, I get a data frame with a single column instead of the one I want (a two-column data frame like test_3):
dput(head(serie_elec))
structure(list(`Période;Production brute d'électricité nucléaire (en GWh)` = c("2022-11;22951.429",
"2022-10;21465.026", "2022-09;19334.531", "2022-08;19319.365",
"2022-07;19923.664", "2022-06;21275.248")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
Here is the head of my dataset for reproducibility (it is a time series of French electricity production):
structure(list(Période = c("2022-11", "2022-10", "2022-09",
"2022-08", "2022-07", "2022-06"),
Production.brute.d.électricité.nucléaire..en.GWh. = c(22951.429,
21465.026, 19334.531, 19319.365, 19923.664, 21275.248)),
row.names = c(NA, 6L), class = "data.frame")
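If the goal is only to get a proper date column and a time series out of a data frame like the one above, a possible route is to convert the "YYYY-MM" strings after the import instead of at read time. The sketch below is a minimal example under a few assumptions: the column names `periode` and `production_gwh` are shortened stand-ins chosen for illustration (they are not the names in the real file), the six rows are the ones shown in the post, and the series is assumed to be monthly with no missing months.

```r
# Head of the dataset from the post, with hypothetical short column names
test_3 <- data.frame(
  periode = c("2022-11", "2022-10", "2022-09", "2022-08", "2022-07", "2022-06"),
  production_gwh = c(22951.429, 21465.026, 19334.531,
                     19319.365, 19923.664, 21275.248)
)

# "YYYY-MM" has no day component, so append "-01" before parsing
test_3$date <- as.Date(paste0(test_3$periode, "-01"), format = "%Y-%m-%d")

# The rows shown are newest-first; ts() expects chronological order
test_3 <- test_3[order(test_3$date), ]

# Build a monthly ts object starting at the earliest month
start_year  <- as.integer(format(test_3$date[1], "%Y"))
start_month <- as.integer(format(test_3$date[1], "%m"))
serie_ts <- ts(test_3$production_gwh,
               start = c(start_year, start_month),
               frequency = 12)

print(serie_ts)
```

Note that `ts()` only stores a start point and a frequency, so this approach would silently misalign the values if the real file skips any month.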
Would changing delim = " ; " to delim = ";" solve the problem?
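A sketch of what that change could look like is given below. It rests on two observations from the output in the post: the file contains `;` with no surrounding spaces, so `" ; "` never matches and everything lands in one column, and readr keeps the header text as written, whereas base `read.csv()` rewrites it into the dotted form `Production.brute.d.électricité.nucléaire..en.GWh.`. The names passed to `cols()` therefore probably need to be the original header strings. The temporary file and the `csv_path` variable are stand-ins for the real `dataset_univarie.csv`, and the header line is reconstructed from the `dput()` output above. As far as I know, readr fills in the first day of the month when parsing `%Y-%m`, but that is worth checking on the real data.

```r
library(readr)

# Stand-in for dataset_univarie.csv: a few rows from the post in a temp file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Période;Production brute d'électricité nucléaire (en GWh)",
  "2022-11;22951.429",
  "2022-10;21465.026",
  "2022-09;19334.531"
), csv_path)

serie_elec <- read_delim(
  csv_path,
  delim = ";",                      # no spaces around the separator
  escape_double = FALSE,
  col_types = cols(
    # Names must match the header as readr sees it, not the dotted base R names
    `Période` = col_date(format = "%Y-%m"),
    `Production brute d'électricité nucléaire (en GWh)` = col_number()
  ),
  trim_ws = TRUE
)

str(serie_elec)
```

If matching the accented names proves fragile, an alternative is to pass the column types without names, in file order, so that they are applied by position.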