grep with multiple patterns in R gives unexplained results

ExecKeywords <- c('cio','cto','cco','coo','ciso','cso','cdo','cdio',
'Chief Information','CIO','Chief Technology Officer','Chief Compliance Officer','Chief Security')


Titles <- c('Director - Customer Success','CIO','Director Cloud Operations',
'Director of Information Technology and Chief Information Security Officer',
'Director, Information Services','Director, Global Information Technology',
'Chief Technology Officer','Sr. Director','COO / CTO Advice Company',
'Director of Information Technology','Director of Technology',
'Vice President, Platform Operations and Information Technology',
'Accounting Manager','VP, Strategy and Programs','IT Director','CTO',
'Director of Network Services','Director','Director, Application Engineering',
'Deputy Director of Technology')



grep(paste(ExecKeywords, collapse = "|"), Titles, value = T)


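I am trying to identify the titles that match any one of the patterns in ExecKeywords. Adding a space after each element of ExecKeywords, before the pipe, or after the pipe (in the collapse argument) each changes the result, but none of them gives what I want. Every post I have found points to the paste and collapse approach, but it does not seem to work for me. Am I missing something? ignore.case does not seem to work properly either.

I would expect this to be returned: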

'CIO','Director of Information Technology and Chief Information Security Officer','Chief Technology Officer','COO / CTO Advice Company','CTO'

r analysis data-wrangling
1 Answer

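One option is to also paste word boundaries (\\b) around the alternation, so that the keywords are not matched as substrings inside longer words. Without them, cto matches inside "Director" and cco matches inside "Accounting". Setting ignore.case = TRUE then lets the lowercase keywords match uppercase titles such as "CTO".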

grep(paste0("\\b(", paste(ExecKeywords, collapse = "|"), ")\\b"),
       Titles, value = TRUE, ignore.case = TRUE)
#[1] "CIO"                                                                      
#[2] "Director of Information Technology and Chief Information Security Officer"
#[3] "Chief Technology Officer"                                                 
#[4] "COO / CTO Advice Company"                                                 
#[5] "CTO"       
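
To see why the original call returned unexpected titles, it may help to compare the plain alternation with the word-bounded one on a small subset. The following is a minimal sketch: the shortened ExecKeywords and Titles vectors and the names plain, bounded, plain_hits and bounded_hits are chosen here for illustration and are not part of the original post.

```r
# A reduced set of keywords and titles taken from the question
ExecKeywords <- c("cio", "cto", "cco", "coo")
Titles <- c("Director of Technology", "CTO",
            "COO / CTO Advice Company", "Accounting Manager")

# Plain alternation: any keyword may match anywhere, even inside a word
plain <- paste(ExecKeywords, collapse = "|")

# Case-sensitive, no boundaries: "cto" hits "Director", "cco" hits "Accounting"
plain_hits <- grepl(plain, Titles)
print(plain_hits)
# [1]  TRUE FALSE FALSE  TRUE

# Wrap the alternation in \\b so only whole words can match,
# and ignore case so lowercase keywords match "CTO" and "COO"
bounded <- paste0("\\b(", plain, ")\\b")
bounded_hits <- grepl(bounded, Titles, ignore.case = TRUE)
print(bounded_hits)
# [1] FALSE  TRUE  TRUE FALSE

print(Titles[bounded_hits])
# [1] "CTO"                      "COO / CTO Advice Company"
```

The parentheses around the alternation matter: without them, \\b would bind only to the first and last keywords rather than to every alternative.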