How to remove everything except the numeric elements in R

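Apologies, since there are surely many similar questions and answers, but I have tried a bunch of the suggested answers and, unfortunately, no dice.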

I have temperature data in three columns of a data frame (tempdata). For simplicity, I am only trying to change one of these locations at a time (wentworth.castle).

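This is what my data looks like. Every column with ".castle" in its name holds the temperatures for that site. There are missing values (empty strings), but that is expected. I would like to turn them into NA.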

glimpse(tempdata)
Rows: 3,395
Columns: 5
$ Description      <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", …
$ date.time        <chr> "22/11/2023 09:48", "22/11/2023 10:18", "22/11/2023 10:48", "22/11/2023 11:…
$ site.castle      <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",…
$ dover.castle     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",…
$ wentworth.castle <chr> "9.484 \xb0C", "9.642 \xb0C", "9.768 \xb0C", "9.994 \xb0C", "10.066 \xb0C",…

I tried some of the approaches below, but got the following errors.

tempdata$wentworth.castle <- gsub(" �C", "", as.numeric(tempdata$wentworth.castle))
#Error in is.factor(x) : invalid multibyte string at '<b0>C'

tempdata$wentworth.castle <- gsub(" \xb0C", "", as.numeric(tempdata$wentworth.castle))
#Error in is.factor(x) : invalid multibyte string at '<b0>C'

tempdata$wentworth.castle = tempdata$wentworth.castle.replace('\u00b0','', regex=True)
#Error: attempt to apply non-function

tempdata$wentworth.castle <- as.numeric(tempdata$wentworth.castle)
#Error: invalid multibyte string at '<b0>C'

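I also tried a less robust approach: writing a function that drops everything after a certain number of characters. That is awkward, though, because my values sometimes have 5 significant figures and sometimes 6, so even if it worked, some entries would still be left with stray whitespace.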

left = function(string, chat){substr(string, 1, char)}
tempdata$wentworth.castle <- left(tempdata$wentworth.castle, 6)
#Error in as.integer(stop) : 
#  cannot coerce type 'closure' to vector of type 'integer'
r dplyr gsub temperature
1 Answer

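This is an encoding problem: the byte \xb0 is the degree sign in ISO-8859-1 (Latin-1), but on its own it is not valid UTF-8, which is why R reports an invalid multibyte string. You can convert the strings with iconv and then use gsub to remove what you do not want: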

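# Example data: "\xb0" is the degree sign stored as a single Latin-1 byte
wentworth <- c("9.484 \xb0C", "9.642 \xb0C", "9.768 \xb0C", "9.994 \xb0C", "10.066 \xb0C")

# Convert to UTF-8 first, then strip the " °C" suffix
gsub(" °C", "", iconv(wentworth, from = "ISO-8859-1", to = "UTF-8"))

# [1] "9.484"  "9.642"  "9.768"  "9.994"  "10.066"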
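
The result above is still a character vector, while the question asks for numbers, with missing readings as NA. A minimal sketch of how the same idea might be applied to every ".castle" column follows. It assumes the data really is Latin-1 encoded, as the \xb0 byte suggests. The small tempdata built here is a made-up stand-in for the real data frame, and clean_temp is a hypothetical helper name, not a function from any package.

```r
# Made-up stand-in for the tempdata data frame described in the question
tempdata <- data.frame(
  date.time        = c("22/11/2023 09:48", "22/11/2023 10:18", "22/11/2023 10:48"),
  dover.castle     = c("", "", ""),
  wentworth.castle = c("9.484 \xb0C", "", "10.066 \xb0C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert one temperature column to numeric
clean_temp <- function(x) {
  # Assumption: the input is Latin-1; convert it so the strings are valid UTF-8
  x <- iconv(x, from = "ISO-8859-1", to = "UTF-8")
  # Drop every character that is not a digit, a decimal point, or a minus sign
  x <- gsub("[^0-9.-]", "", x)
  # Empty readings become NA explicitly
  x[!nzchar(x)] <- NA_character_
  as.numeric(x)
}

# Apply the helper to every column whose name ends in ".castle"
temp_cols <- grep("\\.castle$", names(tempdata), value = TRUE)
tempdata[temp_cols] <- lapply(tempdata[temp_cols], clean_temp)

tempdata$wentworth.castle  # numeric values 9.484, NA, 10.066
```

Because the pattern keeps only digits, the decimal point, and the minus sign, it does not matter whether a value has 5 or 6 significant figures, so no fixed-width substr is needed. Since the question is tagged dplyr, the same helper could presumably be used as mutate(across(ends_with(".castle"), clean_temp)).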