Elasticsearch query does not work with a value containing "@"

Question

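When I run a simple search query on an email address, it returns nothing unless I remove the "@" and what follows it. Why?

I want to be able to run fuzzy and autocomplete queries on email addresses.

Elasticsearch info: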

{
  "name" : "ZZZ",
  "cluster_name" : "YYY",
  "cluster_uuid" : "XXX",
  "version" : {
    "number" : "6.5.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "WWW",
    "build_date" : "2018-11-29T23:58:20.891072Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Mapping:

PUT users
{
  "mappings":
  {
    "_doc": { "properties": { "mail": { "type": "text" } } }
  }
}

All data:

[
    { "mail": "[email protected]" },
    { "mail": "[email protected]" }
]

Query that works:

The term query works, but only for "firstname.lastname", even though mail == "[email protected]" and not "firstname.lastname"...

QUERY :
GET users/_search
{ "query": { "term": { "mail": "firstname.lastname" } }}

RETURN :
{
  "took": 7,
  "timed_out": false,
  "_shards": { "total": 6, "successful": 6, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 4.336203,
    "hits": [
      {
        "_index": "users",
        "_type": "_doc",
        "_id": "H1dQ4WgBypYasGfnnXXI",
        "_score": 4.336203,
        "_source": {
          "mail": "[email protected]"
        }
      }
    ]
  }
}

Query that does not work:

QUERY :
GET users/_search
{ "query": { "term": { "mail": "[email protected]" } }}

RETURN :
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 6, "successful": 6, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

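Solution

Change the mapping so that the mail field uses a custom analyzer based on the uax_url_email tokenizer, then reindex the documents after the mapping change.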

PUT users
{
  "settings":
  {
    "index": { "analysis": { "analyzer": { "mail": { "tokenizer":"uax_url_email" } } } }
  },
  "mappings":
  {
    "_doc": { "properties": { "mail": { "type": "text", "analyzer":"mail" } } }
  }
}
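
To illustrate why a tokenizer that keeps the address whole changes the result, here is a minimal Python sketch. It is not Elasticsearch code: the regular expression is a hypothetical, much simplified stand-in for uax_url_email, the address firstname.lastname@company.com is an assumed sample value, and the dictionary is a toy inverted index.

```python
import re

# Hypothetical pattern: an email-like run (text@text) is kept as one token,
# everything else is split on whitespace. The real tokenizer is far more thorough.
EMAIL_OR_WORD = re.compile(r"[^\s@]+@[^\s@]+|[^\s@]+")


def email_aware_tokens(text):
    """Return tokens, keeping anything shaped like an email address in one piece."""
    return EMAIL_OR_WORD.findall(text)


def build_index(docs):
    """Toy inverted index: map each token to the ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in email_aware_tokens(text):
            index.setdefault(token, set()).add(doc_id)
    return index


sample_tokens = email_aware_tokens("contact firstname.lastname@company.com today")

docs = {1: "firstname.lastname@company.com"}  # assumed sample document
index = build_index(docs)

# An exact (term-style) lookup now finds the full address ...
full_hits = index.get("firstname.lastname@company.com", set())
# ... but the part before the "@" is no longer a token of its own.
partial_hits = index.get("firstname.lastname", set())

print(sample_tokens)
print(full_hits, partial_hits)
```

Note the trade-off the sketch makes visible: once the whole address is a single token, an exact lookup on "firstname.lastname" alone stops matching. The analyzer defined above also has no lowercase filter, so, like this sketch, it would presumably keep the original casing of each token.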

1 Answer

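If you do not configure another tokenizer for your indexed text field, it uses the standard tokenizer, which splits on the @ symbol [I do not have a source for this, but the evidence is below].

If you use a term query instead of a match query, that exact term is looked up in the inverted index without being analyzed (see: elasticsearch match vs term query).

Your inverted index looks like this: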

GET users/_analyze
{
  "text": "firstname.lastname@company.com"
}

{
  "tokens": [
    {
      "token": "firstname.lastname",
      "start_offset": 0,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "company.com",
      "start_offset": 19,
      "end_offset": 30,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
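
The effect of those two tokens on a term query can be sketched in a few lines of Python. This is a toy model rather than Elasticsearch itself: the tokenizer below only lowercases and splits on "@" and whitespace, which is a crude approximation of the standard analyzer, and the address is assumed to be firstname.lastname@company.com, as the token offsets above suggest.

```python
import re


def standard_like_tokens(text):
    """Crude stand-in for the default analysis: lowercase, split on '@' and whitespace."""
    return [t for t in re.split(r"[@\s]+", text.lower()) if t]


def build_index(docs):
    """Toy inverted index: map each token to the ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in standard_like_tokens(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_lookup(index, term):
    """Like a term query: the input is looked up verbatim, with no analysis applied."""
    return index.get(term, set())


docs = {1: "firstname.lastname@company.com"}  # assumed sample document
index = build_index(docs)

# The index only knows the two tokens produced at index time.
print(sorted(index))

partial_hits = term_lookup(index, "firstname.lastname")
full_hits = term_lookup(index, "firstname.lastname@company.com")
print(partial_hits)  # the token exists, so document 1 is found
print(full_hits)     # the full address was never stored as one token
```

The full address is not a key of the index, so the exact lookup comes back empty, which mirrors the zero hits in the question.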

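To fix this, you can specify your own analyzer for the mail field, or you can use a match query, which analyzes the search text the same way the indexed text was analyzed.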

GET users/_search
{
  "query": {
    "match": {
      "mail": "[email protected]"
    }
  }
}
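
One thing to keep in mind with this match query: the query text is split into tokens too, and by default a match query combines them with OR, so any document sharing one token (for example the same domain) can match as well. The Python sketch below models that behaviour. It is a toy, not Elasticsearch: the tokenizer is a crude approximation of the standard analyzer, and both sample addresses are assumptions.

```python
import re


def analyze(text):
    """Crude stand-in for the default analysis: lowercase, split on '@' and whitespace."""
    return [t for t in re.split(r"[@\s]+", text.lower()) if t]


def match(docs, query, operator="or"):
    """Toy match query: analyze the query, then require any ("or") or all ("and") tokens."""
    wanted = analyze(query)
    combine = any if operator == "or" else all
    hits = []
    for doc_id, text in docs.items():
        doc_tokens = set(analyze(text))
        if combine(token in doc_tokens for token in wanted):
            hits.append(doc_id)
    return hits


docs = {  # assumed sample documents
    1: "firstname.lastname@company.com",
    2: "other.person@company.com",
}

or_hits = match(docs, "firstname.lastname@company.com")
and_hits = match(docs, "firstname.lastname@company.com", operator="and")
print(or_hits)   # both documents share the "company.com" token
print(and_hits)  # only document 1 contains every query token
```

In Elasticsearch, the match query accepts an "operator" parameter set to "and" for the stricter behaviour; a custom analyzer that keeps the address as one token avoids the issue altogether.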