I have the following minimal .csv file:
"Sl.no","Col1","Col2","Col3"
"1","one","two","three",
"2","A","B","C",
When I open it in Excel or Google Sheets, the file is imported correctly.
When I import it in RStudio with R:
temp <- read.csv("file.csv", header = TRUE)
In temp, the column headers are off by one.
When I remove the trailing commas from the second and third lines, i.e. import this file:
"Sl.no","Col1","Col2","Col3"
"1","one","two","three"
"2","A","B","C"
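the file is read correctly, and temp contains the expected result.
Questions:
Are trailing commas allowed in a .csv file? If not, are Excel and Google Sheets just being forgiving?
I could probably remove the trailing commas with a regex, but I don't know how to make that change to the file as text and then read it as a .csv.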
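The shift described above can be reproduced without a file on disk. This is a minimal sketch, assuming the file content is passed inline through the text argument of read.csv; the names csv_text and shifted are made up for illustration. When the header has one field fewer than the data rows, read.csv uses the first data column as row names, so every header moves one column to the left.

```r
# Inline copy of the file with trailing commas (illustrative variable names).
csv_text <- c(
  '"Sl.no","Col1","Col2","Col3"',
  '"1","one","two","three",',
  '"2","A","B","C",'
)

# The header has 4 fields, the data rows have 5 (the last one is empty),
# so read.csv treats the first data column as row names.
shifted <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

rownames(shifted)  # the Sl.no values end up as row names
shifted$Sl.no      # holds the values that belong to Col1
shifted$Col3       # holds the empty trailing field, read as NA
```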
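You could try adding a trailing comma to every line, collapsing the doubled commas with gsub, and then reading the result with read.csv: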
edited <- gsub(",,", ",", paste0(readLines("~/Desktop/file.csv"), ","), fixed = TRUE)
read.csv(textConnection(edited), header = TRUE, stringsAsFactors = FALSE)[1:4]
#>   Sl.no Col1 Col2  Col3
#> 1     1  one  two three
#> 2     2    A    B     C
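Explanation: First, readLines imports the text "as is". Next, paste0 adds a comma to the end of every line. After that, gsub replaces every instance of ",," with ",". Finally, textConnection and read.csv read the result. Note the [1:4], which keeps only the first 4 columns: every line, header included, now ends with a comma, so read.csv sees an empty fifth field and returns a blank fifth column.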
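An alternative that is closer to the regex idea in the question is to strip the comma at the end of each line before parsing, so no extra column has to be dropped afterwards. This is a minimal sketch, assuming the lines are supplied inline (with a real file they would come from readLines("file.csv")); the names lines, cleaned and temp are illustrative. It also leaves empty fields in the middle of a row (",,") intact, which the gsub approach would collapse. One caveat: a row whose last field is legitimately empty would lose that field too.

```r
# Inline copy of the file; for a real file use: lines <- readLines("file.csv")
lines <- c(
  '"Sl.no","Col1","Col2","Col3"',
  '"1","one","two","three",',
  '"2","A","B","C",'
)

# Remove a single comma at the very end of each line.
# Lines without a trailing comma (here, the header) are left unchanged.
cleaned <- sub(",$", "", lines)

# Parse the cleaned lines directly; no temporary file is needed.
temp <- read.csv(text = cleaned, header = TRUE, stringsAsFactors = FALSE)

names(temp)
temp
```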
I would suggest a different approach, using read_csv from the readr package:
library(readr)
temp <- read_csv("file.csv")
temp
# A tibble: 2 x 4
  Sl.no Col1  Col2  Col3
  <int> <chr> <chr> <chr>
1     1 one   two   three
2     2 A     B     C
Data used:
"Sl.no","Col1","Col2","Col3"
"1","one","two","three",
"2","A","B","C",