I have the following Elasticsearch search configuration:
query_string: {
  query: `${sanitizedQueryString}~`,
  fields: ['fieldOne^5', 'fieldTwo^5', 'fieldThree'],
},
This splits every search string I pass in into individual words and searches for each of those words. Suppose I have the following records:
[{
  id: 1,
  name: 'Harris'
}, {
  id: 2,
  name: 'Smith'
}, {
  id: 3,
  name: 'Dallas'
}, {
  id: 4,
  name: 'Farmers And Workers'
}];
If my search query is "harris smith", I get back the first 2 records in the array above.
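If I pass the search query "harris and smith", I get back 3 records: the first 2 and the last one. The last record is returned because its name contains the word "And" and my search query also contains the word "and".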
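Words like "and", "of", and "or" are common function words in English (conjunctions and prepositions). How can I exclude such words from my search query so that they are neither analyzed nor searched?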
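Use the stop token filter. It removes stop words from the token stream; its default English list includes "and", "of", and "or".
Mapping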
PUT /conjunctions
{
  "settings": {
    "analysis": {
      "analyzer": {
        "stop_lowercase_whitespace_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "stop",
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "name": {
        "type": "text",
        "analyzer": "stop_lowercase_whitespace_analyzer"
      }
    }
  }
}
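To make it easier to see what this analyzer does to a piece of text, here is a rough JavaScript simulation of the same chain (whitespace tokenizer, then stop, then lowercase). It is only a sketch and not Elasticsearch code: the function names are made up for illustration, and the word list is assumed to be the default English stop list of the built-in stop filter.

```javascript
// Rough simulation of the analyzer defined above; NOT Elasticsearch code.
// Assumed to mirror the default English stop list of the "stop" filter.
const ENGLISH_STOP_WORDS = new Set([
  'a', 'an', 'and', 'are', 'as', 'at', 'be', 'but', 'by', 'for', 'if',
  'in', 'into', 'is', 'it', 'no', 'not', 'of', 'on', 'or', 'such', 'that',
  'the', 'their', 'then', 'there', 'these', 'they', 'this', 'to', 'was',
  'will', 'with',
]);

// "whitespace" tokenizer: split on runs of whitespace.
const whitespaceTokenizer = (text) => text.split(/\s+/).filter(Boolean);

// "stop" filter with default settings: case-sensitive removal.
const stopFilter = (tokens) => tokens.filter((t) => !ENGLISH_STOP_WORDS.has(t));

// "lowercase" filter.
const lowercaseFilter = (tokens) => tokens.map((t) => t.toLowerCase());

// Filters run in the order listed in the analyzer: stop, then lowercase.
function analyze(text) {
  return lowercaseFilter(stopFilter(whitespaceTokenizer(text)));
}

console.log(analyze('harris and smith'));    // [ 'harris', 'smith' ]
console.log(analyze('Farmers And Workers')); // [ 'farmers', 'and', 'workers' ]
```

The lowercase "and" in the query is dropped, so the query can no longer match record 4. The capitalized "And" in the document survives in this simulation because the stop step sees it before it has been lowercased.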
Documents
PUT /conjunctions/_bulk
{"create":{"_id":1}}
{"id":1,"name":"Harris"}
{"create":{"_id":2}}
{"id":2,"name":"Smith"}
{"create":{"_id":3}}
{"id":3,"name":"Dallas"}
{"create":{"_id":4}}
{"id":4,"name":"Farmers And Workers"}
Query
GET /conjunctions/_search?filter_path=hits.hits
{
  "query": {
    "match": {
      "name": "harris and smith"
    }
  }
}
Response
{
  "hits" : {
    "hits" : [
      {
        "_index" : "conjunctions",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3940738,
        "_source" : {
          "id" : 1,
          "name" : "Harris"
        }
      },
      {
        "_index" : "conjunctions",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.3940738,
        "_source" : {
          "id" : 2,
          "name" : "Smith"
        }
      }
    ]
  }
}
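You can change the list of stop words; see the documentation for the stop token filter. Note that the stop filter is case-sensitive by default (ignore_case is false), so with stop listed before lowercase a capitalized "And" is not removed. List lowercase first, or set ignore_case to true, if you want those removed as well.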
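As an illustration of changing the stop word list, here is a sketch of index settings with a custom stop filter, written as a JavaScript object so it can be passed to whatever client you use. The filter name conjunction_stop, the analyzer name conjunction_analyzer, and the three-word list are assumptions made for this example; type, stopwords, and ignore_case are the stop filter's documented parameters.

```javascript
// Hypothetical index settings: a custom stop filter with its own word list.
// The filter and analyzer names are made up for this example.
const settings = {
  analysis: {
    filter: {
      conjunction_stop: {
        type: 'stop',
        stopwords: ['and', 'or', 'of'], // only the words you want dropped
        ignore_case: true,              // also match "And", "OR", ...
      },
    },
    analyzer: {
      conjunction_analyzer: {
        tokenizer: 'whitespace',
        filter: ['lowercase', 'conjunction_stop'],
      },
    },
  },
};

// Body for a "PUT /<index>" request, same shape as the mapping request above.
const createIndexBody = {
  settings,
  mappings: {
    properties: {
      name: { type: 'text', analyzer: 'conjunction_analyzer' },
    },
  },
};

console.log(JSON.stringify(createIndexBody, null, 2));
```

A short explicit list like this keeps other default stop words (for example "the" or "not") searchable, which may or may not be what you want.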
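It looks like you are using the default standard analyzer, in which the stop token filter is disabled by default. This is how your tokens are created: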
POST _analyze
{
  "analyzer": "standard",
  "text": "harris and smith"
}
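For plain ASCII input like this, the effect of the standard analyzer can be approximated in a few lines of JavaScript. This is only a sketch: the function name is hypothetical, and the regular expression is a crude stand-in for the real tokenizer, which follows Unicode text segmentation rules.

```javascript
// Approximation of the "standard" analyzer for plain ASCII text only.
// The real tokenizer follows Unicode text segmentation rules.
function standardAnalyzeApprox(text) {
  return text
    .split(/[^A-Za-z0-9]+/)               // crude stand-in for the tokenizer
    .filter(Boolean)                      // drop empty strings
    .map((token) => token.toLowerCase()); // lowercase filter; no stop filter
}

console.log(standardAnalyzeApprox('harris and smith'));
// [ 'harris', 'and', 'smith' ]
```

Because "and" is kept as a token, it matches the "And" in "Farmers And Workers".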
Let's adjust the mapping and the analyzer:
PUT /standard_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_standard": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "stop"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "rebuilt_standard"
      }
    }
  }
}
Index the data:
PUT /standard_example/_bulk
{"create":{"_id":1}}
{"id":1,"name":"Harris"}
{"create":{"_id":2}}
{"id":2,"name":"Smith"}
{"create":{"_id":3}}
{"id":3,"name":"Dallas"}
{"create":{"_id":4}}
{"id":4,"name":"Farmers And Workers"}
Run the search using a query string:
GET standard_example/_search
{
  "query": {
    "query_string": {
      "query": "Harris and Smith",
      "fields": ["name"]
    }
  }
}
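To carry this over to the configuration in the question, each searched field would need the analyzer in its mapping. The sketch below assumes the three field names from the question are all text fields; buildMappings and buildSearchBody are hypothetical helper names, not part of any client library.

```javascript
// Field names come from the question; the helper names are hypothetical.
const FIELDS = ['fieldOne', 'fieldTwo', 'fieldThree'];

// Mapping body that assigns the rebuilt_standard analyzer to every field.
function buildMappings(fields, analyzerName = 'rebuilt_standard') {
  const properties = {};
  for (const field of fields) {
    properties[field] = { type: 'text', analyzer: analyzerName };
  }
  return { properties };
}

// Same query shape as in the question: trailing "~" and boosted fields.
function buildSearchBody(sanitizedQueryString) {
  return {
    query: {
      query_string: {
        query: `${sanitizedQueryString}~`,
        fields: ['fieldOne^5', 'fieldTwo^5', 'fieldThree'],
      },
    },
  };
}

console.log(JSON.stringify(buildMappings(FIELDS), null, 2));
console.log(JSON.stringify(buildSearchBody('harris and smith'), null, 2));
```

Since the analyzer is set in the mapping, the query itself does not change. The analyzer of an existing field cannot be changed in place, so an existing index would have to be recreated and reindexed.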