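I apologize, as there are surely plenty of similar questions and answers, but I have tried a bunch of the suggested answers and unfortunately had no luck.
I have temperature data in three columns of a data frame (tempdata). For simplicity, I am only trying to change one of these sites at a time (wentworth.castle).
This is what my data looks like. Every column ending in ".castle" holds the temperatures for that site. There are missing values, but that is expected. I would like to turn them into NA.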
glimpse(tempdata)
Rows: 3,395
Columns: 5
$ Description <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", …
$ date.time <chr> "22/11/2023 09:48", "22/11/2023 10:18", "22/11/2023 10:48", "22/11/2023 11:…
$ site.castle <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",…
$ dover.castle <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",…
$ wentworth.castle <chr> "9.484 \xb0C", "9.642 \xb0C", "9.768 \xb0C", "9.994 \xb0C", "10.066 \xb0C",…
I tried some of the approaches below, but got the following errors.
tempdata$wentworth.castle <- gsub(" �C", "", as.numeric(tempdata$wentworth.castle))
#Error in is.factor(x) : invalid multibyte string at '<b0>C'
tempdata$wentworth.castle <- gsub(" \xb0C", "", as.numeric(tempdata$wentworth.castle))
#Error in is.factor(x) : invalid multibyte string at '<b0>C'
tempdata$wentworth.castle = tempdata$wentworth.castle.replace('\u00b0','', regex=True)
#Error: attempt to apply non-function
tempdata$wentworth.castle <- as.numeric(tempdata$wentworth.castle)
#Error: invalid multibyte string at '<b0>C'
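I also tried a less robust approach: writing a function that drops everything after a certain number of characters. This is awkward because my data sometimes has 5 significant figures and sometimes 6, so even if it worked I would be left with stray spaces to remove from some entries.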
left = function(string, chat){substr(string, 1, char)}
tempdata$wentworth.castle <- left(tempdata$wentworth.castle, 6)
#Error in as.integer(stop) :
# cannot coerce type 'closure' to vector of type 'integer'
This is an encoding problem. You can convert the strings with iconv and then use gsub to remove the part you do not want:
# data
wentworth <- c("9.484 \xb0C", "9.642 \xb0C", "9.768 \xb0C", "9.994 \xb0C", "10.066 \xb0C")
gsub(" °C","", iconv(wentworth, from = "ISO-8859-1", to = "UTF-8"))
# [1] "9.484" "9.642" "9.768" "9.994" "10.066"
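To carry this through to the data frame in the question, the same two steps can be wrapped in a small helper and applied to every ".castle" column, followed by a conversion to numeric. This is only a sketch: the helper name clean_temp and the three-row tempdata below are made up for illustration, and it assumes the raw bytes really are ISO-8859-1 (Latin-1), as in the example above.

```r
# Hypothetical stand-in for the asker's data frame; column names follow the question.
tempdata <- data.frame(
  date.time        = c("22/11/2023 09:48", "22/11/2023 10:18", "22/11/2023 10:48"),
  dover.castle     = c("", "", ""),
  wentworth.castle = c("9.484 \xb0C", "9.642 \xb0C", ""),
  stringsAsFactors = FALSE
)

# clean_temp is an illustrative helper name, not part of any package.
clean_temp <- function(x) {
  x <- iconv(x, from = "ISO-8859-1", to = "UTF-8")  # re-encode the Latin-1 bytes as UTF-8
  x <- gsub(" \u00b0C", "", x, fixed = TRUE)        # drop the literal " °C" suffix
  x[x == ""] <- NA                                  # empty strings become NA
  as.numeric(x)                                     # now safe to convert
}

# Apply the helper to every column whose name ends in ".castle".
castle_cols <- grep("\\.castle$", names(tempdata), value = TRUE)
tempdata[castle_cols] <- lapply(tempdata[castle_cols], clean_temp)
str(tempdata)
```

Setting the empty strings to NA before calling as.numeric gives the NA values the question asks for. If the file was written in a different encoding, the from argument of iconv would need to change accordingly.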