No match on a document if the search string is longer than the search field

Question

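I am searching for a title.

The title is stored as "Police diary: Stefan Zweig".

When I search for "Police" I get the result. But when I search for "Policeman" I get no result.

Here is the query: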

{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "fields": [
              "title",
              omitted because irrelevance...
            ],
            "query": "Policeman",
            "fuzziness": "1.5",
            "prefix_length": "2"
          }
        }
      ],
      "must": {
        omitted because irrelevance...
      }
    }
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    }
  ]
}

Here is the mapping:

{
    "books": {
        "mappings": {
            "book": {
                "_all": {
                    "analyzer": "nGram_analyzer", 
                    "search_analyzer": "whitespace_analyzer"
                },
                "properties": {
                    "title": {
                        "type": "text",
                        "fields": {
                            "raw": {
                                "type": "keyword"
                            },
                            "sort": {
                                "type": "text",
                                "analyzer": "to order in another language, (creates a string with symbols)",
                                "fielddata": true
                            }
                        }
                    }
                }
            }
        }
    }
}

Note that I have a document titled "some title", and if I search for "someone title" it gets a hit.

I don't know why the police book does not show up.

Answer 1

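So there are two parts to your question.

1. You want a search for `policeman` to return the title that contains `police`.
2. You want to know why the `some title` document matches the `someone title` query, and on that basis you expect the first document to match as well.

Let me first explain why the second query matches and the first one does not, and then show you how to make the first query work.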

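Your document containing `some title` produces the following tokens, which you can verify with the Analyze API. The request below uses the `standard` analyzer, which is the default analyzer for a text field.

POST /_analyze

{
    "text": "some title",
    "analyzer": "standard"
}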

Generated tokens

{
    "tokens": [
        {
            "token": "some",
            "start_offset": 0,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "title",
            "start_offset": 5,
            "end_offset": 10,
            "type": "<ALPHANUM>",
            "position": 1
        }
    ]
}

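Now, when you search with `someone title`, the match query is analyzed, and it uses the same analyzer that was applied to the field at index time.

So it creates two tokens, `someone` and `title`, and the match query matches on the `title` token, which is why the document appears in the search results. You can also use the Explain API to verify this and see the details of how the match happened internally. By contrast, `policeman` is analyzed into the single token `policeman`, which is not equal to the indexed token `police`, so the first query finds nothing.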
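The token-level reasoning above can be sketched in a few lines of Python. This is only a rough simulation, not Elasticsearch itself: `analyze` is a hypothetical stand-in for the standard analyzer (split on non-word characters, lowercase), `match` imitates a match query with the default OR operator, and `edit_distance` is a plain Levenshtein distance. It also suggests why the fuzziness setting in the question does not help, assuming the usual cap of two edits for fuzzy matching.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split into word tokens, lowercase."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def match(query, field_value):
    """Hit if at least one query token equals an indexed token (OR semantics)."""
    return bool(set(analyze(query)) & set(analyze(field_value)))

def edit_distance(a, b):
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

print(match("someone title", "some title"))              # True, via the shared token "title"
print(match("Policeman", "Police diary: Stefan Zweig"))  # False, no shared token
print(edit_distance("policeman", "police"))              # 3
```

`policeman` is three edits away from `police`, which is more than fuzzy matching tolerates, so neither the exact token comparison nor fuzziness bridges the gap.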

How to match the `police` title when searching for `policeman`

You need to use a synonym token filter, as shown in the example below.

Index definition

{
    "settings": {
        "analysis": {
            "analyzer": {
                "synonyms": {
                    "filter": [
                        "lowercase",
                        "synonym_filter"
                    ],
                    "tokenizer": "standard"
                }
            },
            "filter": {
                "synonym_filter": {
                    "type": "synonym",
                    "synonyms": ["policeman => police"]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "dialog": {
                "type": "text",
                "analyzer": "synonyms"
            }
        }
    }
}

Note the synonym rule `policeman => police`: it rewrites the token `policeman` to `police` whenever the `synonyms` analyzer runs on the `dialog` field.

Index a sample document

{
  "dialog" : "police"
}

Search query with the term `policeman`

{
    "query": {
        "match" : {
            "dialog" : {
                "query" : "policeman"
            }
        }
    }
}

And the search result (note that the source contains only `police`)

 "hits": [
      {
        "_index": "so_syn",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "dialog": "police"
        }
      }
    ]
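To see why the synonym rule makes this query hit, here is a minimal Python sketch of the same idea. It is a simulation under stated assumptions, not the Elasticsearch implementation: `SYNONYMS` mirrors the `policeman => police` rule, and `analyze_with_synonyms` and `match` are hypothetical helpers that tokenize, lowercase, and rewrite tokens in the same way for both the indexed text and the query.

```python
import re

# Explicit mapping in the style of the synonym rule "policeman => police".
SYNONYMS = {"policeman": "police"}

def analyze_with_synonyms(text):
    """Tokenize, lowercase, then rewrite each token through the synonym map."""
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    return [SYNONYMS.get(t, t) for t in tokens]

def match(query, field_value):
    """Hit if the analyzed query and the analyzed field share at least one token."""
    return bool(set(analyze_with_synonyms(query)) & set(analyze_with_synonyms(field_value)))

print(analyze_with_synonyms("policeman"))  # ['police']
print(match("policeman", "police"))        # True
```

Because the query term is rewritten to `police` before it is compared, it lines up with the token stored for the document, even though the stored source text is unchanged.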
