In Excel, how do I find the words that titles in a range share, starting from the left?

Question  Votes: 0  Answers: 3

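I scraped a seller's website and now need to identify variants of the same product from their titles. How can I find the words that the variant titles have in common?

Example below: [image: EXAMPLE IN THE BELOW]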

excel vba duplicates range formula
3 Answers
1
vote

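I would prefer to do this in VBA, but it can still be done with formulas.

Looking at your sample strings, you need to remove one, two, or three words from the right to extract the common string.

What you need is to remove the last word from the end, then search the other rows for the remainder wrapped in * wildcards, using MATCH with match_type 0.

If removing one word is not enough to find a match, do the same with two words, then with three, and so on if you need more.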
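To make the fallback order easier to follow before reading the formulas, here is a minimal sketch of the same idea in Python. It is not part of the worksheet solution; the function name `common_prefix`, the `max_words` parameter, and the sample titles are all made up for illustration.

```python
def common_prefix(title, others, max_words=3):
    """Strip 1..max_words trailing words from title until the remainder
    occurs inside one of the other titles. Return it, or None."""
    words = title.split()
    for n in range(1, max_words + 1):
        if n >= len(words):
            break  # nothing would be left to search for
        prefix = " ".join(words[:-n])
        # Same test as MATCH("*" & prefix & "*", range, 0):
        # a case-insensitive "contains" check against every other row.
        if any(prefix.lower() in other.lower() for other in others):
            return prefix
    return None


titles = [  # hypothetical sample data, not the asker's list
    "Acme Mug Red",
    "Acme Mug Blue Large",
    "Zen Lamp Small",
]
for i, title in enumerate(titles):
    rest = titles[:i] + titles[i + 1:]  # every row except the current one
    print(title, "->", common_prefix(title, rest))
```

Like the formulas, the sketch stops at the first (longest) remainder that matches, so a title that needs two words removed is only tried with two after one word has failed.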

Below is the output of the formulas that strip one, two, and three words respectively:

[image: output of the three formulas]

CELL B2: formula that strips one word from the right and looks for a match

=IFNA(IFNA(IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1)&"*",A3:$A$14,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1),""),IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1)&"*",$A$1:A1,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1),"")),"STRIPPING ONE WORD NOT ENOUGH")
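The core of that formula is the LEFT/FIND/SUBSTITUTE part, which locates the last space by counting spaces and replacing only that occurrence with a "[" marker. A rough Python equivalent, with the hypothetical name `strip_last_words` and a `k` parameter for how many words to drop, could look like this:

```python
def strip_last_words(text, k=1):
    """Drop the last k space-separated words, the way
    LEFT(A2, FIND("[", SUBSTITUTE(A2, " ", "[", n - (k - 1))) - 1) does."""
    n = text.count(" ")      # LEN(A2) - LEN(SUBSTITUTE(A2, " ", ""))
    target = n - (k - 1)     # which space to mark, counted from the left
    if target < 1:
        return None          # the worksheet formula returns an error here
    pos = -1
    for _ in range(target):  # walk to the target-th space
        pos = text.index(" ", pos + 1)
    return text[:pos]        # LEFT(..., position of marker - 1)


print(strip_last_words("Acme Mug Blue Large", 1))
print(strip_last_words("Acme Mug Blue Large", 2))
```

With k = 1 the marked space is the last one, with k = 2 the one before it, which is why the two-word and three-word formulas only differ by the extra -1 and -2 in the instance number.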

CELL C2: formula that strips one word and, if IFNA catches no match, strips two words from the right and looks for a match

=IFNA(IFNA(IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1)&"*",A3:$A$14,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1),""),IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1)&"*",$A$1:A1,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1),"")),IFNA(IFNA(IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1)&"*",A3:$A$14,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1),""),IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1)&"*",$A$1:A1,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1),"")),"STRIPPING TWO WORDS NOT ENOUGH"))

CELL D2: formula that strips one word, then two words if IFNA catches no match, then three words from the right, and looks for a match

=IFNA(IFNA(IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1)&"*",A3:$A$14,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1),""),IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1)&"*",$A$1:A1,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))))-1),"")),IFNA(IFNA(IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1)&"*",A3:$A$14,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1),""),IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1)&"*",$A$1:A1,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-1),"")),IFNA(IFNA(IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-2))-1)&"*",A3:$A$14,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-2))-1),""),IF(MATCH("*"&LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-2))-1)&"*",$A$1:A1,0)>0,LEFT(A2,FIND("[",SUBSTITUTE(A2," ","[",LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-2))-1),"")),"STRIPPING THREE WORDS NOT ENOUGH")))

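Looking at the sample strings in your question, you should use the formula in cell D2, because you need to remove three words or fewer. If you have four or more, follow the same logic: replace "STRIPPING THREE WORDS NOT ENOUGH" with another copy of the one-word block in which the SUBSTITUTE instance number is LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-3, and repeat this for more words. Then just copy/paste the formula down the rest of the list.

Reminder: my list ends at A13, so I used A14 in the formulas. Change this according to the length of your list.

Another reminder: these formulas are based on your sample strings. If you run into any problem with them on your original list, add those strings to your question as well so that you can get a workable solution.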


0
votes

You can use VLOOKUP. According to this tutorial (https://www.excel-university.com/perform-approximate-match-and-fuzzy-lookup-in-excel/), the formula is:

=VLOOKUP(C7, Table1, 2, FALSE)

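Where:

  • C7 is the value to look up
  • Table1 is the lookup range
  • 2 is the column that holds the value we want returned
  • FALSE means we are not doing a range lookup, so only an exact match is returned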
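For readers who think more easily in code, the four arguments above behave roughly like the sketch below. It is only an illustration: `vlookup_exact` and the sample table are made up, `None` stands in for #N/A, and the sketch ignores the wildcard support that VLOOKUP has for text lookups.

```python
def vlookup_exact(value, table, col_index):
    """Return the col_index-th (1-based) cell of the first row whose first
    cell equals value, ignoring case. None stands in for #N/A."""
    for row in table:
        if str(row[0]).lower() == str(value).lower():
            return row[col_index - 1]
    return None


table1 = [("Acme Mug", "SKU-1"), ("Zen Lamp", "SKU-2")]  # hypothetical data
print(vlookup_exact("acme mug", table1, 2))
print(vlookup_exact("Acme Mug Red", table1, 2))
```

The second call finds nothing, which is the limitation to keep in mind here: an exact-match lookup only helps once the variant title has already been reduced to the shared product name.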

0
votes

If you are on Microsoft Windows with Microsoft Excel 2010 or later installed, I would consider using Microsoft's Fuzzy Lookup add-in.

Download: https://www.microsoft.com/en-gb/download/details.aspx?id=15011

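It is easy to use, and its key advantage is that you can easily extend it when you come across new terms. You can also review the matches by hand in a simple way and tune the matching parameters for better results.

You can create a mapping table to reach your goal. Think of your simplified examples as a reference table of expected expressions that new values are compared against. You get a score for how likely each new value is to be a match.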
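To show what "a reference table plus a score" means in practice, here is a small sketch using Python's standard `difflib`. This is not the algorithm the add-in uses, only a stand-in similarity measure; `best_match` and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(value, reference):
    """Return (closest reference entry, similarity score between 0 and 1)."""
    scored = [
        (SequenceMatcher(None, value.lower(), ref.lower()).ratio(), ref)
        for ref in reference
    ]
    score, ref = max(scored)  # highest score wins
    return ref, score


reference = ["Acme Mug", "Zen Lamp"]  # hypothetical expected product names
for title in ["Acme Mug Red", "Zen Lamp Small", "Acme Mug"]:
    print(title, "->", best_match(title, reference))
```

As with the add-in, you would normally pick a threshold and check the matches that score just above or below it by hand.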

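From the MS Fuzzy readme:

A challenging problem in data management is that the same entity may be represented in multiple ways throughout a dataset. For example, a customer "Andy Hill" might also appear as "Mr. Andrew Hill" or "Hill, Andrew R.". Variations may come from merging independent data sources, spelling mistakes, inconsistent naming conventions and abbreviations, or records with additional/missing information.

The Fuzzy Lookup technology developed by Microsoft Research lets you quickly identify data records that are textually similar. You can identify fuzzy duplicates within a single table, or perform a fuzzy join between two different tables.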

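Extras:

1) Tutorials for the add-in

2) For geeks: if you are interested in the technical side of fuzzy matching, see this amazing answer by @Alain and the discussion around it:

Getting the closest string match

3) Technical resources for Fuzzy Lookup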
