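In an Excel workbook, I have a table with a column of keywords that I want to match against the long text strings in the rows of a column in another table.
Keywords table: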
Keyword |
---|
not a problem |
same problem |
dont recommend |
cant stand |
"I can hear" |
"matchmaker" |
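Data table:
Text |
---|
Stolen pen drive is not a problem |
Things I need to do every day |
Vision is not a problem |
Be kind to yourself |
The car has the same problem too |
I need to find her a "matchmaker" |
Please don't recommend the flavour |
Can you hear me? "I can hear your voice" |
I can't stand stupid behaviour |
Keep steady, guys! |
...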
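I need a regular expression that matches the keywords (which can be multiple words) against the strings in the "Text" column and returns the whole string. Keywords should match with or without apostrophes: for example, dont and don't should match each other, as should can't and cant, or its and it's.
Also, double quotes enclosing a keyword should be ignored when matching against the string.
The regex match should return the entire string.
Does anyone know how to do this?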
    ' Loop through each keyword and search for matches
    For Each keywordCell In KeywordsRange
        Dim keyword As String
        keyword = keywordCell.Value
        ' Set the regex pattern based on whether the keyword is enclosed in double-quotes
        If Left(keyword, 1) = """" And Right(keyword, 1) = """" Then
            keyword = Mid(keyword, 2, Len(keyword) - 2)
            regex.pattern = "\b" & keyword & "\b"
        Else
            regex.pattern = "\b" & keyword & "\b|\b" & Replace(keyword, "'", "\'") & "\b"
        End If
        ' Search for matches in concatenated string
        Set matchCollection = regex.Execute(concatenatedString)
        ' Loop through matches and populate the output table
        ' <== WOULD BE GOOD TO COLLECT ALL MATCHES IN 2-D ARRAY AND RESIZE OUTPUT SHEET CELL WITH IT
        For Each regexMatch In matchCollection
            lrow = wkOutput.Cells(wkOutput.Rows.Count, 1).End(xlUp).Row + 1
            newRow.Range(lrow, 1).Value = regexMatch.SubMatches(0) ' Interviewee
            newRow.Range(lrow, 2).Value = regexMatch.Value ' Matched text
            newRow.Range(lrow, 3).Value = keyword ' Keyword
        Next regexMatch
    Next keywordCell
Please try this.
    For Each keywordCell In KeywordsRange
        Dim keyword As String
        keyword = keywordCell.Value
        ' Set the regex pattern based on whether the keyword is enclosed in double-quotes
        If Left(keyword, 1) = """" And Right(keyword, 1) = """" Then
            keyword = Mid(keyword, 2, Len(keyword) - 2)
            regex.Pattern = "\b" & keyword & "\b"
        Else
            ' remove "'" instead of "\'"
            regex.Pattern = "\b" & keyword & "\b|\b" & Replace(keyword, "'", "") & "\b"
        End If
        ' Search for matches in concatenated string
        Set matchCollection = regex.Execute(concatenatedString)
        ' Loop through matches and populate the output table
        ' <== WOULD BE GOOD TO COLLECT ALL MATCHES IN 2-D ARRAY AND RESIZE OUTPUT SHEET CELL WITH IT
        Set wkoutput = Sheet2
        For Each regexMatch In matchCollection
            lrow = wkoutput.Cells(wkoutput.Rows.Count, 2).End(xlUp).Row + 1
            If regexMatch.SubMatches.Count > 0 Then
                wkoutput.Cells(lrow, 1).Value = regexMatch.SubMatches(0) ' Interviewee
            End If
            wkoutput.Cells(lrow, 2).Value = regexMatch.Value ' Matched text
            wkoutput.Cells(lrow, 3).Value = keyword ' Keyword
        Next regexMatch
    Next keywordCell
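If the keyword itself has no apostrophe (dont) but the text does (don't), the alternation above will not catch it, and the match is only the keyword rather than the whole string. One possible way around both is sketched below. It is only a sketch under some assumptions: `BuildKeywordPattern` and `DemoKeywordMatch` are hypothetical names that are not in the question, the sample strings are made up for illustration, and the text is assumed to be joined with `vbLf` so that each original cell is one line. The idea is to strip apostrophes from the keyword, allow an optional apostrophe between any two adjacent letters, and wrap the result in `^.*` and `.*$` with `MultiLine` turned on so that `regexMatch.Value` holds the full line.

```vba
Option Explicit

' Hypothetical helper (not from the question): builds a pattern that
' matches the entire line containing the keyword, with or without apostrophes.
Function BuildKeywordPattern(ByVal keyword As String) As String
    Dim i As Long
    Dim cur As String
    Dim body As String

    ' Ignore double quotes that enclose the keyword
    If Len(keyword) >= 2 And Left$(keyword, 1) = """" And Right$(keyword, 1) = """" Then
        keyword = Mid$(keyword, 2, Len(keyword) - 2)
    End If

    ' Drop apostrophes from the keyword; they come back as optional below
    keyword = Replace(keyword, "'", "")

    For i = 1 To Len(keyword)
        cur = Mid$(keyword, i, 1)

        ' Escape regex metacharacters so the keyword is matched literally
        If InStr(1, "\^$.|?*+()[]{}", cur) > 0 Then
            body = body & "\" & cur
        Else
            body = body & cur
        End If

        ' Allow an optional apostrophe between two adjacent letters
        If i < Len(keyword) Then
            If cur Like "[A-Za-z]" And Mid$(keyword, i + 1, 1) Like "[A-Za-z]" Then
                body = body & "'?"
            End If
        End If
    Next i

    ' ^.* and .*$ make the match span the whole line (needs MultiLine = True)
    BuildKeywordPattern = "^.*\b" & body & "\b.*$"
End Function

' Hypothetical demo with made-up sample text, one entry per line
Sub DemoKeywordMatch()
    Dim regex As Object
    Dim m As Object
    Dim textBlock As String

    Set regex = CreateObject("VBScript.RegExp")
    regex.Global = True
    regex.IgnoreCase = True
    regex.MultiLine = True

    textBlock = "Please don't recommend the flavour" & vbLf & _
                "Be kind to yourself" & vbLf & _
                "I cant stand stupid behaviour"

    regex.Pattern = BuildKeywordPattern("dont recommend")
    For Each m In regex.Execute(textBlock)
        Debug.Print m.Value   ' full line that contains the keyword
    Next m
End Sub
```

Note that the optional apostrophe is deliberately loose: it would also accept something like d'ont, which is probably harmless here but worth knowing. If the text is joined with `vbCrLf` instead of `vbLf`, the matched line may carry a trailing carriage return that you would need to trim.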