Data cleaning with grepl using [:alpha:] and [:punct:]

Question

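I am simply using grepl incorrectly. I need to combine [:alpha:] and [:punct:] in a single grepl call so that I can find and remove the rows that contain alphabetic or punctuation characters. The input data is provided below. The goal is to find the entries containing punctuation or letters, remove those characters, and replace the entries with NA or NaN. How can grepl be combined with [:alpha:] and [:punct:] in R?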

grepl("[:alpha:]:[:punct:]:",df$Incoming.Examinations)
dput(df$Incoming.Examinations)

dput(abberville_LA$Incoming.Examinations)
c("698", "xx?*&?/..", "1934", "2294", "962", "724", "4978", 
"99999999", "4841", "Closed for Holidays", "*", "775", "634", "1276", "1320", 
"3455", "886", "1973", "2739", "311", "999999999", "939", "545", 
"3946", "2239", "1041", "411", "3258", "entered by J.f. williams", 
"1115", "*", "4729", "5008", "*", "*", "1011", "1065", "2262", 
"3459", "596", "776", "1866", "5000", "1578", "393", "*", "*", 
"875", "2772", "997", "664", "680", "4351", "1205", "732")
r data-cleaning grepl
1 Answer

If the intention is to convert the non-numeric elements to NA, then

as.numeric(v1)

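will automatically convert every element that cannot be parsed as a number to NA (R emits a coercion warning when it does so). Note that the result is a numeric vector rather than a character vector.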
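As a minimal sketch of that behaviour, the snippet below runs the conversion on a small subset of the question's values. The names `v` and `nums` are illustrative only, and `suppressWarnings()` is used here just to silence the expected coercion warning.

```r
# Small illustrative subset of the question's data (hypothetical name `v`)
v <- c("698", "xx?*&?/..", "1934", "*")

# as.numeric() returns NA for anything that cannot be parsed as a number;
# suppressWarnings() hides the "NAs introduced by coercion" warning
nums <- suppressWarnings(as.numeric(v))

nums
```

Silencing the warning is a convenience for this sketch; leaving it visible can be a useful reminder that some values were dropped.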

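However, if we need to use grepl(), match strings that consist of one or more digits ([0-9]+) from the start (^) to the end ($) of the string, then negate (!) the result and assign NA to those elements: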

v1[!grepl("^[0-9]+$", v1)] <- NA
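To address the question as literally asked: POSIX classes such as [:alpha:] and [:punct:] are only valid inside a bracket expression, so they need an outer pair of square brackets, and several classes can share one. The following is a sketch under that assumption, using a small subset of the question's data; the names `x` and `has_alpha_or_punct` are illustrative.

```r
# Small illustrative subset of the question's data (hypothetical name `x`)
x <- c("698", "xx?*&?/..", "Closed for Holidays", "*", "775")

# Both classes sit inside one outer bracket expression, so the pattern
# matches any string containing at least one letter or punctuation character
has_alpha_or_punct <- grepl("[[:alpha:][:punct:]]", x)

# Replace the flagged entries with NA; the vector stays character
x[has_alpha_or_punct] <- NA

x
```

One design difference is worth keeping in mind: this pattern flags only letters and punctuation, so an entry containing other non-digit characters (a space between digits, for example) would pass through, whereas the "^[0-9]+$" approach rejects anything that is not purely digits.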

data

v1 <- c("698", "xx?*&?/..", "1934", "2294", "962", "724", "4978", "99999999", 
"4841", "Closed for Holidays", "*", "775", "634", "1276", "1320", 
"3455", "886", "1973", "2739", "311", "999999999", "939", "545", 
"3946", "2239", "1041", "411", "3258", "entered by J.f. williams", 
"1115", "*", "4729", "5008", "*", "*", "1011", "1065", "2262", 
"3459", "596", "776", "1866", "5000", "1578", "393", "*", "*", 
"875", "2772", "997", "664", "680", "4351", "1205", "732")