How to handle compound words in Elasticsearch


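I know Elasticsearch has a good Compound Word Token Filter, but my problem is a bit different. I want to know how search engines like Google handle open-form compound words such as "post office" or "living room". If you type "postoffice" instead of "post office", you still get the same results. I want the same behavior in my Elasticsearch search engine. What is the solution to this problem? Should I tokenize "post office" as a single token? If so, how?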

java elasticsearch search-engine information-retrieval
1 answer

You should add an analyzer to the search query.

See my answer for the mapping and documents.
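The mapping referred to above is not reproduced in this post, so the following is only a guess at what an analyzer with this name might look like. It is a minimal sketch that builds the index body as a Python dict, assuming the analyzer combines the `standard` tokenizer, the `lowercase` filter, and a `dictionary_decompounder` filter. The filter name `english_decompounder` and the word list are hypothetical, and the typeless `mappings` layout assumes Elasticsearch 7 or later.

```python
import json

# Hypothetical word list; a real index needs a dictionary that fits its data.
word_list = ["some", "thing", "sea", "side", "tea"]

index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # "dictionary_decompounder" is a built-in filter type;
                # the filter name "english_decompounder" is made up here.
                "english_decompounder": {
                    "type": "dictionary_decompounder",
                    "word_list": word_list,
                }
            },
            "analyzer": {
                # Same name as the analyzer used in the queries of this answer.
                "lowercase_english_decompounder_standard_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "english_decompounder"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            # Index the "name" field with the same analyzer (an assumption).
            "name": {
                "type": "text",
                "analyzer": "lowercase_english_decompounder_standard_analyzer",
            }
        }
    },
}

# Serialized body for a request such as PUT /decompounder
print(json.dumps(index_body, indent=2))
```

The printed JSON could be sent as the body of the index creation request. Applying the analyzer at index time as well as at query time is what lets the sub-words of an indexed compound match a two-word query.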

Query with a compound word

"something"

GET /decompounder/_search?filter_path=hits.hits
{
        "query": {
                "multi_match" : {
                        "query": "something",
                        "analyzer": "lowercase_english_decompounder_standard_analyzer", 
                        "fields": ["name"]
                }
        }
}

Response

{
    "hits" : {
        "hits" : [
            {
                "_index" : "decompounder",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 0.23911434,
                "_source" : {
                    "name" : "something sea"
                }
            },
            {
                "_index" : "decompounder",
                "_type" : "_doc",
                "_id" : "2",
                "_score" : 0.23911434,
                "_source" : {
                    "name" : "something tea"
                }
            },
            {
                "_index" : "decompounder",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 0.23911434,
                "_source" : {
                    "name" : "something seaside"
                }
            }
        ]
    }
}

Query with two words

"some thing"

GET /decompounder/_search?filter_path=hits.hits
{
        "query": {
                "multi_match" : {
                        "query": "some thing",
                        "analyzer": "lowercase_english_decompounder_standard_analyzer", 
                        "fields": ["name"]
                }
        }
}

The response is the same.
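To see why the two queries above can hit the same documents, it may help to look at the tokens a dictionary-based decompounder would produce. The following is a simplified stand-in written for illustration, not Elasticsearch's actual implementation: it lowercases, splits on whitespace, and emits every dictionary word found inside each token. The word list and the function names are hypothetical.

```python
# Hypothetical dictionary of sub-words.
WORD_LIST = {"some", "thing", "sea", "side", "tea"}


def decompound(token, words=WORD_LIST, min_subword_size=2):
    """Return the token followed by every dictionary word found inside it."""
    tokens = [token]
    for start in range(len(token)):
        for end in range(start + min_subword_size, len(token) + 1):
            part = token[start:end]
            if part in words and part != token:
                tokens.append(part)
    return tokens


def analyze(text):
    """Lowercase, split on whitespace, then decompound each token."""
    result = []
    for token in text.lower().split():
        result.extend(decompound(token))
    return result


compound_terms = set(analyze("something"))
two_word_terms = set(analyze("some thing"))

# Terms that both queries would look up in the index.
shared = compound_terms & two_word_terms
print(sorted(shared))
```

Under these assumptions, both queries end up searching for the terms `some` and `thing`, so a document whose `name` contains "something" and was indexed with the same analysis matches either query.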
