Elasticsearch suggest autocomplete: results are not what I expected

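I'm having trouble understanding the results I get from the suggest API.

My goal is for this result not to be returned.

How to reproduce. Here is my mapping: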


PUT /movies
{
  "settings": {
    "analysis": {
      "filter": {
        "true_false_filter": {
          "type": "keep",
          "keep_words": [
            "true",
            "false"
          ]
        },
        "french_elision": {
          "type": "elision",
          "articles_case": false,
          "articles": [
            "puisqu"
          ]
        },
        "french_stemmer": {
          "type": "stemmer",
          "language": "light_french"
        },
        "organic-dictionnary": {
          "type": "synonym",
          "expand": true,
          "lenient": true,
          "synonyms": [
            "non bio"
          ]
        },
        "french_stop_filter": {
          "type": "stop",
          "ignore_case": true,
          "stopwords": "_french_"
        }
      },
      "analyzer": {
        "lowercase_stop_analyzer": {
          "tokenizer": "lowercase",
          "filter": [
            "french_stop_filter"
          ]
        },
        "lowercase_asciifolding": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "asciifolding",
            "lowercase"
          ]
        },
        "french_analyzer_custom": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "asciifolding",
            "lowercase",
            "french_elision",
            "french_stemmer"
          ]
        },
        "custom_organic_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "asciifolding",
            "lowercase",
            "french_elision",
            "organic-dictionnary",
            "true_false_filter",
            "unique"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "attr": {
        "type": "text",
        "analyzer": "french_analyzer_custom"
      },
      "brand_name": {
        "type": "keyword"
      },
      "brand_name_suggest": {
        "type": "completion",
        "analyzer": "lowercase_stop_analyzer",
        "search_analyzer": "lowercase_asciifolding",
        "preserve_separators": false,
        "preserve_position_increments": false,
        "max_input_length": 50
      }
    }
  }
}

Then I index one document:

POST /movies/_doc/1001
{
    "brand_name": "A LE MOUTON HUILE D'OLIVE",
    "brand_name_suggest": [
      "A LE MOUTON HUILE D'OLIVE"
    ]
}

Then here is my search:

GET movies/_search
{
  "explain": true, 
  "suggest": {
    "completer": {
      "text": "amo",
      "completion": {
        "field": "brand_name_suggest",
        "size": 20,
        "skip_duplicates": true
      }
    }
  }
}

My question: why is this document found when searching for "amo"?

How can I prevent it from being returned?

Thanks in advance.

elasticsearch autocomplete elasticsearch-suggester
1 Answer

Because brand_name_suggest uses lowercase_stop_analyzer, which removes French stop words, "A LE MOUTON HUILE D'OLIVE" is analyzed into the tokens a, mouton, huile, olive. In other words, LE is removed.
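One way to check this yourself is the _analyze API, which returns the tokens an analyzer emits for a given text. A minimal sketch, assuming the movies index from the question already exists (the response is not reproduced here; run it to confirm the token list):

```
# Show what the index-time analyzer of brand_name_suggest produces
GET /movies/_analyze
{
  "analyzer": "lowercase_stop_analyzer",
  "text": "A LE MOUTON HUILE D'OLIVE"
}
```

Passing "field": "brand_name_suggest" instead of "analyzer" should select the same index-time analyzer through the mapping.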

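So at search time, when you type amo, it matches the first two tokens (a followed by the start of mouton), which is why you get this document. This works because the field sets both preserve_separators and preserve_position_increments to false, so neither the space between tokens nor the gap left by the removed stop word counts during prefix matching. If you want to prevent this, you need to remove french_stop_filter from the index-time analyzer.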
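A minimal sketch of that change follows. The index name movies_v2 and the analyzer name lowercase_only_analyzer are hypothetical, chosen only for illustration. A new index is used because the analyzer of an existing field cannot be changed in place, so the documents would need to be reindexed into it. Only the parts relevant to the suggester are shown.

```
# Hypothetical index: same completion field, but no stop filter at index time
PUT /movies_v2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_only_analyzer": {
          "type": "custom",
          "tokenizer": "lowercase"
        },
        "lowercase_asciifolding": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "asciifolding",
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "brand_name": {
        "type": "keyword"
      },
      "brand_name_suggest": {
        "type": "completion",
        "analyzer": "lowercase_only_analyzer",
        "search_analyzer": "lowercase_asciifolding",
        "preserve_separators": false,
        "preserve_position_increments": false,
        "max_input_length": 50
      }
    }
  }
}
```

With LE kept in the token stream, the indexed input would start with a followed by le rather than a followed by mouton, so amo should no longer be a matching prefix, while inputs such as "a le mo" still should be.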

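Another issue that may bite you later: your search analyzer lowercase_asciifolding applies asciifolding, but your index-time analyzer does not. If you index words with accents, you may not be able to find them at search time.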
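The mismatch can be observed by running the same accented text through both analyzers. This is a sketch against the movies index from the question; the sample text is an arbitrary example and not taken from the question's data:

```
# Index-time analyzer: no asciifolding, so accents are expected to survive
GET /movies/_analyze
{
  "analyzer": "lowercase_stop_analyzer",
  "text": "Crème Brûlée"
}

# Search-time analyzer: asciifolding is expected to strip the accents
GET /movies/_analyze
{
  "analyzer": "lowercase_asciifolding",
  "text": "Crème Brûlée"
}
```

If the two token lists differ, a suggestion typed with accents is looked up in folded form and may not line up with what was indexed. Applying asciifolding on both sides, or on neither, should keep them consistent.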
