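I have a vector of strings from a survey that contains people's job titles. Some of the responses are: CEO, ceo, Ceo, CEO/owner, CEO/founder.
I want to replace any string that contains the word ceo (uppercase, lowercase, with a trailing or leading space) with CEO.
I have tried the code below; it replaces some of them, but not all.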
aiprm$job_title <- gsub("ceo|CEO|owner|Ceo|Owner|executive|CEO |CEo|CE0|CEO/CEO|\\ceo|\\CEO", "CEO",aiprm$job_title)
It still misses some, such as: Broker/CEO, Business CEO/Health Coach, CEO and Producer, CEO/Creative Director, CEO/Designer, CEO/Operator.
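A likely reason these entries survive is that gsub() replaces only the substring that matched, not the whole element, so the text around "CEO" stays in place. The sketch below illustrates this on a small made-up vector (the `titles` values are hypothetical, not taken from the survey) and shows one way to make the pattern cover the whole string.

```r
# Hypothetical sample titles, made up for illustration only
titles <- c("Broker/CEO", "ceo/Health Coach", "Ceo & Producer", "Marketing Director")

# gsub() swaps only the matched substring, so the surrounding text survives
partial <- gsub("ceo", "CEO", titles, ignore.case = TRUE)

# Wrapping the pattern in .* makes the match span the entire string,
# so the whole element is replaced
whole <- sub(".*ceo.*", "CEO", titles, ignore.case = TRUE)

print(partial)
print(whole)
```

Elements that do not contain "ceo" are left untouched in both cases.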
Use grep to find where "ceo" matches (case-insensitively) and replace the entire value.
quux[grep("ceo", quux$job_title, ignore.case = TRUE),1:2]
# job_title industry
# 14 CEO Advertising
# 28 CEO Marketing Agency
# 64 CEO Education
# 70 Founder & CEO AI Consulting
# 81 CEO ZK ART criations
# 83 CEO Marketing
# 110 Ceo Digital marketing
# 111 CEO Web Design
# 120 CEO Marketing & Advertisement
# 124 CEO Trainings
# 125 CEO Healthcare
# 128 CEO consultation
# 132 CEO IT-Services
# 144 Ceo Media
# 167 CEO BRANDING AND PRINTING
# 176 CEO Civil Engineering
# 180 ceo software
# 195 ceo marketing digital
# 210 CEO & Producer Comm (Rádio and Audio Producer)
# 217 CEO services
# 253 CEO Trucking; Travel, e-commerce
# 256 CEO Home
# 262 President, CEO Management Consulting
# 272 CEO Short Term Rentals and Hospitality
# 280 ceo eletrônicos
# 285 Ceo/owner Entrepreneur
# 312 CEO Nonprofit- services for people with I/DD
# 316 Ceo Marketing
# 321 Founder & CEO Digital Media
# 330 CEO PR
# 333 CEO Marketing
# 337 ceo agri
# 359 CEO Marketing
# 366 CEO Media
# 378 CEO IT/SocialNetwork
# 404 CEO Digital Marketing Agency
# 419 CEO Publicité
# 431 Ceo Ceo
# 439 CEO SaaS
# 442 CEO Digital Marketing
# 443 CEO wellness and health, Real State
# 445 Owner/ceo Disability
# 452 CEO Advising and Entrepreneurship
# 453 CEO KCARBONFREE G-W-RBIO
# 458 CEO Software
quux$job_title[grepl("ceo", quux$job_title, ignore.case = TRUE)] <- "CEO"
quux[grep("ceo", quux$job_title, ignore.case = TRUE),1:2]
# job_title industry
# 14 CEO Advertising
# 28 CEO Marketing Agency
# 64 CEO Education
# 70 CEO AI Consulting
# 81 CEO ZK ART criations
# 83 CEO Marketing
# 110 CEO Digital marketing
# 111 CEO Web Design
# 120 CEO Marketing & Advertisement
# 124 CEO Trainings
# 125 CEO Healthcare
# 128 CEO consultation
# 132 CEO IT-Services
# 144 CEO Media
# 167 CEO BRANDING AND PRINTING
# 176 CEO Civil Engineering
# 180 CEO software
# 195 CEO marketing digital
# 210 CEO Comm (Rádio and Audio Producer)
# 217 CEO services
# 253 CEO Trucking; Travel, e-commerce
# 256 CEO Home
# 262 CEO Management Consulting
# 272 CEO Short Term Rentals and Hospitality
# 280 CEO eletrônicos
# 285 CEO Entrepreneur
# 312 CEO Nonprofit- services for people with I/DD
# 316 CEO Marketing
# 321 CEO Digital Media
# 330 CEO PR
# 333 CEO Marketing
# 337 CEO agri
# 359 CEO Marketing
# 366 CEO Media
# 378 CEO IT/SocialNetwork
# 404 CEO Digital Marketing Agency
# 419 CEO Publicité
# 431 CEO Ceo
# 439 CEO SaaS
# 442 CEO Digital Marketing
# 443 CEO wellness and health, Real State
# 445 CEO Disability
# 452 CEO Advising and Entrepreneurship
# 453 CEO KCARBONFREE G-W-RBIO
# 458 CEO Software
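One caveat with a plain "ceo" pattern is that it matches anywhere inside a word. If that is a concern, a word-boundary pattern may be safer. This is a minimal sketch on a made-up vector (the `titles` values are hypothetical), assuming that "ceo" should only count when it stands as its own word.

```r
# Hypothetical sample titles, made up for illustration only
titles <- c("CEO", " ceo ", "Founder & Ceo", "Owner/ceo",
            "Cretaceous Geologist", "Senior SEO Analyst")

# \\b marks a word boundary, so "ceo" inside "Cretaceous" is not matched;
# spaces and "/" around the word do not prevent a match
is_ceo <- grepl("\\bceo\\b", titles, ignore.case = TRUE, perl = TRUE)

# Replace the whole element wherever the word was found
titles[is_ceo] <- "CEO"

print(titles)
```

The same logical index could be applied to a data frame column, in the way the grepl() call above is applied to quux$job_title.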
Here is a sample of the data used above; "all" of it is too large for a Stack answer.
quux <- structure(list(job_title = c("Marketing Director", "Digital Content Manager", "Owner", "Content Principal", "Chief Consultant", "Managing Director", "Senior SEO Analyst", "Senior SEO Specialist", "Head of Content", "", "", "", "content manager", "CEO", "Head of SEO", "Owner/SEO Consultant", "Snr Manager, SEO + Talent", "Head of Web and Digital Communications", "VP of SEO & Content", "SEO consultant"), industry = c("Sporting Goods", "Marketing and Advertising", "Software", "Tech", "Business technology services", "Marketing", "SEO", "Automotive", "Marketing", "", "", "", "entertainment", "Advertising", "Digital Marketing", "Marketing", "SEO", "Online publishing", "private equity / finance", "SEO")), row.names = c(NA, 20L), class = "data.frame")