How can I show which special character was matched in each row of a single-column data frame?
Sample data frame:
a <- data.frame(name=c("foo","bar'","ip_sum","four","%23","2_planet!","@abc!!"))
Determining whether a string contains special characters:
library(dplyr) # provides %>% and mutate()
# \\1 puts back the exceptions we define (- . / & ,); all other punctuation is removed
a$name_cleansed <- gsub("([-./&,])|[[:punct:]]","\\1",a$name)
a <- a %>% mutate(has_special_char = name != name_cleansed)
If we only need the first special character in each string, we can use str_extract.
library(stringr)
str_extract(a$name,'[[:punct:]]')
#[1] NA "'" "_" NA "%" "_" "@"
If all special characters are needed, we can use str_extract_all, which returns one vector of matches per string.
sapply(str_extract_all(a$name,'[[:punct:]]'), function(x) toString(unique(x)))
#[1] "" "'" "_" "" "%" "_, !" "@, !"
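For readers who would rather not depend on stringr, the same result can probably be obtained in base R. The following is a minimal sketch, assuming the same sample data; the column name special_chars is made up for illustration and is not part of the original answer.

```r
a <- data.frame(name = c("foo", "bar'", "ip_sum", "four", "%23", "2_planet!", "@abc!!"),
                stringsAsFactors = FALSE)

# gregexpr() locates every punctuation match in each string;
# regmatches() extracts the matched text as a list with one vector per row
matches <- regmatches(a$name, gregexpr("[[:punct:]]", a$name))

# Collapse the unique matches of each row into one comma-separated string
# (special_chars is a hypothetical column name)
a$special_chars <- vapply(matches, function(x) toString(unique(x)), character(1))
a$special_chars
```

Rows without any punctuation yield an empty string, because regmatches() returns a zero-length vector for them.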
For a base R option, we can use grepl here:
a <- data.frame(name=c("foo","bar'","ip_sum","four","%23","2_planet!","@abc!!"))
a$has_special_char <- grepl("[[:punct:]]", a$name)
a
       name has_special_char
1       foo            FALSE
2      bar'             TRUE
3    ip_sum             TRUE
4      four            FALSE
5       %23             TRUE
6 2_planet!             TRUE
7    @abc!!             TRUE
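Since grepl only says whether something matched, not what, one way to also show the matched characters in base R is to delete everything that is not punctuation. This is a sketch under the assumption that keeping repeated characters in their original order is acceptable; special_chars is a hypothetical column name.

```r
a <- data.frame(name = c("foo", "bar'", "ip_sum", "four", "%23", "2_planet!", "@abc!!"),
                stringsAsFactors = FALSE)

# TRUE when the string contains at least one punctuation character
a$has_special_char <- grepl("[[:punct:]]", a$name)

# Remove every character that is NOT punctuation, leaving only the
# special characters of each row (duplicates are kept)
a$special_chars <- gsub("[^[:punct:]]", "", a$name)
a
```

With this approach the flag and the extracted characters always agree: a row has a special character exactly when special_chars is non-empty.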