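When I run a simple search query on an email address, it returns nothing unless I remove everything after the "@". Why?
I want to be able to run fuzzy and autocomplete queries on email addresses.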
ELASTICSEARCH INFOS:
{
"name" : "ZZZ",
"cluster_name" : "YYY",
"cluster_uuid" : "XXX",
"version" : {
"number" : "6.5.2",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "WWW",
"build_date" : "2018-11-29T23:58:20.891072Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
MAPPING:
PUT users
{
"mappings":
{
"_doc": { "properties": { "mail": { "type": "text" } } }
}
}
ALL DATA:
[
{ "mail": "[email protected]" },
{ "mail": "[email protected]" }
]
QUERY THAT WORKS:
The term query below works, even though the stored value is mail == "[email protected]"
and not "firstname.lastname"...
QUERY :
GET users/_search
{ "query": { "term": { "mail": "firstname.lastname" } }}
RETURN :
{
"took": 7,
"timed_out": false,
"_shards": { "total": 6, "successful": 6, "skipped": 0, "failed": 0 },
"hits": {
"total": 1,
"max_score": 4.336203,
"hits": [
{
"_index": "users",
"_type": "_doc",
"_id": "H1dQ4WgBypYasGfnnXXI",
"_score": 4.336203,
"_source": {
"mail": "[email protected]"
}
}
]
}
}
QUERY THAT DOES NOT WORK:
QUERY :
GET users/_search
{ "query": { "term": { "mail": "[email protected]" } }}
RETURN :
{
"took": 0,
"timed_out": false,
"_shards": { "total": 6, "successful": 6, "skipped": 0, "failed": 0 },
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
SOLUTION:
Change the mapping so that the mail field uses an analyzer built on the uax_url_email
tokenizer (reindex after changing the mapping).
PUT users
{
"settings":
{
"index": { "analysis": { "analyzer": { "mail": { "tokenizer":"uax_url_email" } } } }
},
"mappings":
{
"_doc": { "properties": { "mail": { "type": "text", "analyzer":"mail" } } }
}
}
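To illustrate why this mapping helps, here is a minimal Python sketch of a tokenizer that keeps a whole email address as one token. It is only a toy stand-in and not the real uax_url_email implementation: the function name toy_email_tokenize, both regular expressions, and the sample address (built from the two tokens "firstname.lastname" and "company.com" that appear in this post) are assumptions made for illustration.

```python
import re

# Assumption: simplified patterns, far looser than the real tokenizer's rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
WORD = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")

def toy_email_tokenize(text):
    """Emit each email address as a single token; split the rest into words."""
    tokens = []
    pos = 0
    for m in EMAIL.finditer(text):
        # Words that appear before this address.
        tokens.extend(WORD.findall(text[pos:m.start()]))
        # The address itself stays in one piece, "@" included.
        tokens.append(m.group())
        pos = m.end()
    # Words that appear after the last address.
    tokens.extend(WORD.findall(text[pos:]))
    return tokens

print(toy_email_tokenize("firstname.lastname@company.com"))
# ['firstname.lastname@company.com']
print(toy_email_tokenize("contact firstname.lastname@company.com today"))
# ['contact', 'firstname.lastname@company.com', 'today']
```

Because the whole address is indexed as one token, a term query on the full address has something to match. Note that the "mail" analyzer above declares only a tokenizer and no token filters, so tokens are not lowercased; adding a lowercase filter is probably worth considering if case-insensitive matching is wanted.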
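If you do not configure another tokenizer for your indexed text field, it uses the standard tokenizer, which splits on the @ symbol [I don't have a source for this, but the proof is below].
If you use a term query instead of a match query, that exact term is looked up in the inverted index without being analyzed first (see "elasticsearch match vs term query").
For this address, the tokens in your inverted index look like this: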
GET users/_analyze
{
"text": "[email protected]"
}
{
"tokens": [
{
"token": "firstname.lastname",
"start_offset": 0,
"end_offset": 18,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "company.com",
"start_offset": 19,
"end_offset": 30,
"type": "<ALPHANUM>",
"position": 1
}
]
}
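The two tokens in the output above explain both results from the question. As a rough sketch, assuming a toy tokenizer that only imitates the standard analyzer for this one address shape (toy_standard_analyze, build_index, term_query and the sample address are hypothetical names and data, not Elasticsearch APIs), a term query behaves like an exact key lookup:

```python
import re
from collections import defaultdict

def toy_standard_analyze(text):
    """Assumption: lowercase, then break on anything that is not a letter,
    digit, or dot, so "@" separates tokens as in the output above."""
    return [t for t in re.split(r"[^a-z0-9.]+", text.lower()) if t]

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, mail in enumerate(docs):
        for token in toy_standard_analyze(mail):
            index[token].add(doc_id)
    return index

def term_query(index, value):
    """A term query does not analyze its input: exact token lookup only."""
    return sorted(index.get(value, set()))

docs = ["firstname.lastname@company.com"]
index = build_index(docs)

print(sorted(index))                                        # ['company.com', 'firstname.lastname']
print(term_query(index, "firstname.lastname"))              # [0]
print(term_query(index, "firstname.lastname@company.com"))  # []
```

The full address is never stored as a token, so looking it up verbatim finds nothing, while "firstname.lastname" is a stored token and matches.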
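To fix this, you can either specify your own analyzer for the mail field, or use a match query, which analyzes the text you search for in the same way the indexed text was analyzed.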
GET users/_search
{
"query": {
"match": {
"mail": "[email protected]"
}
}
}
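One caveat with the match query on a field that still uses the standard tokenizer: the search text is split into the same two tokens, and by default a match query accepts a document that contains any one of them. The Python sketch below models that behavior under stated assumptions: toy_analyze is a crude stand-in for the standard analyzer, match_query is a hypothetical helper, and the second address is invented for the example.

```python
import re

def toy_analyze(text):
    """Assumption: lowercase and break on anything that is not a letter,
    digit, or dot, which is enough to split an address at "@"."""
    return [t for t in re.split(r"[^a-z0-9.]+", text.lower()) if t]

def match_query(docs, text):
    """Analyze the query text, then return the ids of documents that share
    at least one token with it (OR semantics, scoring ignored)."""
    query_tokens = set(toy_analyze(text))
    return [i for i, mail in enumerate(docs)
            if query_tokens & set(toy_analyze(mail))]

docs = [
    "firstname.lastname@company.com",
    "someone.else@company.com",  # hypothetical second user, same domain
]

print(match_query(docs, "firstname.lastname@company.com"))  # [0, 1]
```

In Elasticsearch the exact address would normally score higher because it matches both tokens, but other addresses at the same domain can still appear in the hits. If only the exact address should match, an analyzer that keeps the address as a single token is likely the better fit.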