I have a CSV data file that I want to import into R, but one column is causing a problem.
data <- read.csv("example.csv", header = F, stringsAsFactors = F, sep = ";", colClasses = c("character","character", rep("character", 2),"numeric",rep("character",6),rep("numeric",5)))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'a real', got '"N'
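I realized this happens because one column contains missing values that are recorded as "N (an opening quote with no closing quote). For example, a problematic data row looks like this: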
11e9-afea;"N;"Passenger Car";"SEDAN";"2019";"BMW";"M5";"Gas";"All Wheel Drive";"Automatic";"";"18.1512";"29.0277";"15";"17";"21"
Presumably, a data row like the following is fine:
11e9-afea;"4";"Passenger Car";"SEDAN";"2019";"BMW";"M5";"Gas";"All Wheel Drive";"Automatic";"";"18.1512";"29.0277";"15";"17";"21"
How can I handle all of the "N values and import the data correctly?
Thanks!
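You can try the read_csv function from the readr package. Its na argument lets you list the strings that should be treated as NA, so you can add "N" to it, for example na = c("NA", "", "N"). Note that read_csv expects comma-separated fields; because your file is semicolon-delimited, read_delim with delim = ";" takes the same na argument and is probably the better fit. For reference, the signature of read_csv is: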
read_csv(file, col_names = TRUE, col_types = NULL,
locale = default_locale(), na = c("", "NA"), quoted_na = TRUE,
quote = "\"", comment = "", trim_ws = TRUE, skip = 0,
n_max = Inf, guess_max = min(1000, n_max),
progress = show_progress(), skip_empty_rows = TRUE)
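
If you would rather stay in base R, here is a minimal sketch of a different route (not the readr approach above). It assumes the only malformation is the dangling quote in "N; and that the numeric columns are 5 and 12 to 16, as in the colClasses from the question. The two lines in the `lines` vector are copied from the question and stand in for readLines("example.csv"). Every column is read as text first and converted afterwards, so the result does not depend on how quoted numbers or quoted NA strings are parsed.

```r
# Sample rows copied from the question; in practice use
# lines <- readLines("example.csv")
lines <- c(
  '11e9-afea;"N;"Passenger Car";"SEDAN";"2019";"BMW";"M5";"Gas";"All Wheel Drive";"Automatic";"";"18.1512";"29.0277";"15";"17";"21"',
  '11e9-afea;"4";"Passenger Car";"SEDAN";"2019";"BMW";"M5";"Gas";"All Wheel Drive";"Automatic";"";"18.1512";"29.0277";"15";"17";"21"'
)

# Close the dangling quote so that "N; becomes "N";
# (assumption: this exact pattern is the only malformed field)
fixed <- gsub('"N;', '"N";', lines, fixed = TRUE)

# Read every column as character to avoid scan() errors on numeric columns
data <- read.csv(text = fixed, header = FALSE, sep = ";",
                 stringsAsFactors = FALSE, colClasses = "character")

# Turn the "N" placeholder in the second column into a real NA
data$V2[data$V2 %in% "N"] <- NA

# Convert the columns the question declared as numeric
num_cols <- c(5, 12:16)
data[num_cols] <- lapply(data[num_cols], as.numeric)

str(data)
```

If other columns can also contain the "N placeholder, the same replacement step can be repeated for them before the numeric conversion.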