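I have been working with Elasticsearch and have run into a problem: I am trying to use a regexp to boost my matches. For example, if I query "document 524106", I want it to match on the field, and then, if the matched number also matches my regexp
[0-9]{5,}
the score should be boosted.
I tried combining my match query with a regexp, but it does not work: whatever I do, the regexp boosts every document that contains a number matching the regexp. I tried a must containing both the match and the regexp, I tried using the regexp as a filter, and I tried separating the two. Can someone help me figure out whether this is possible?
I tried several queries, including must and filter clauses and combinations of the two. This is the last one I tried, and it still does not give me the result I am looking for: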
content_query.append(
    {
        "bool": {
            "must": [
                {
                    "match": {
                        "document.number".format(lang): {
                            "query": token
                        }
                    }
                },
                {
                    "regexp": {
                        "document.number".format(lang): {
                            "value": "[0-9]{5,}",
                            "boost": 5
                        }
                    }
                }
            ],
            "filter": {
                "bool": {
                    "must": [
                        {
                            "regexp": {
                                "document.number".format(lang): {
                                    "value": "[0-9]{5,}"
                                }
                            }
                        },
                        {
                            "match": {
                                "document.number".format(lang): {
                                    "query": token
                                }
                            }
                        }
                    ]
                }
            }
        }
    }
)
To give a more concrete example, suppose I have 3 documents:
Document 1
Title : this is document 1
document: I have 56000 dollars in my bank account
Document 2
Title : this is document 2
document: I have 123695 dollars in my bank account
Document 3
Title : this is document 3
document: I have 52.85 dollars in my bank account
When indexing, I separate the numbers from the text, which is why the query targets document.number. The regexp I use to index the numbers is
"\d+\s*\-?\.?\s*\d+"
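If I query "56000 dollars" using both match and regexp, 56000 should satisfy the match query and the regexp, so that document gets boosted. For some reason, the regexp also boosts all the other numbers, which is not what I want.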
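To make the indexing step concrete, here is a minimal Python sketch of what the indexing pattern quoted above would pull out of the three sample documents, and which of those numbers also satisfy the boost pattern [0-9]{5,}. The helper name extract_numbers is hypothetical, and using re.findall for the extraction is an assumption about how the indexing is done. fullmatch is used for the boost check because Elasticsearch regexp patterns have to match the whole term.

```python
import re

# Indexing pattern quoted in the question (assumed to be applied to each document's text).
NUMBER_PATTERN = re.compile(r"\d+\s*\-?\.?\s*\d+")
# Boost pattern from the question: five or more consecutive digits.
BOOST_PATTERN = re.compile(r"[0-9]{5,}")


def extract_numbers(text):
    """Return every substring matched by the indexing pattern (hypothetical helper)."""
    return NUMBER_PATTERN.findall(text)


docs = [
    "I have 56000 dollars in my bank account",
    "I have 123695 dollars in my bank account",
    "I have 52.85 dollars in my bank account",
]

for doc in docs:
    for number in extract_numbers(doc):
        # The whole extracted number must satisfy the boost pattern.
        qualifies = BOOST_PATTERN.fullmatch(number) is not None
        print(number, "qualifies for boost:", qualifies)
```

Under these assumptions, 56000 and 123695 both satisfy the boost pattern while 52.85 does not, so a regexp clause on its own cannot tell documents 1 and 2 apart. Note also that the indexing pattern needs at least two digits, so a single-digit number would not be extracted.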
If I understand your question correctly, the following query is a solution:
GET /regexp_fields/_search?filter_path=hits.hits
{
  "query": {
    "dis_max": {
      "queries": [
        {
          "regexp": {
            "text": "[0-9]{5,}"
          }
        },
        {
          "match": {
            "text": "dollars"
          }
        }
      ],
      "tie_breaker": 0.8
    }
  }
}
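For context on how this query scores documents: dis_max takes the score of the best matching clause and adds tie_breaker times the scores of the other matching clauses. Here is a small Python sketch of that arithmetic. The function name dis_max_score is hypothetical, the regexp clause score of 1.0 is an assumption (a constant score that is consistent with the scores in the response further down), and 0.13353139 is the match clause score taken from that response.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Combine per-clause scores the way dis_max does:
    best clause score + tie_breaker * (sum of the remaining clause scores)."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    return best + tie_breaker * (sum(clause_scores) - best)


REGEXP_SCORE = 1.0        # assumed constant score of the regexp clause
MATCH_SCORE = 0.13353139  # score of the match clause on "dollars" (from the response)

# A document matching both clauses (a 5+ digit number and "dollars").
both_clauses = dis_max_score([REGEXP_SCORE, MATCH_SCORE], tie_breaker=0.8)
# A document matching only the match clause.
match_only = dis_max_score([MATCH_SCORE], tie_breaker=0.8)

print(round(both_clauses, 7), round(match_only, 8))
```

With these inputs a document matching both clauses comes out at roughly 1.1068 and a document matching only "dollars" stays at 0.1335.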
Documents
PUT /regexp_fields/_bulk
{"create":{"_id":1}}
{"title" : "this is document 1", "text": "I have 56000 dollars in my bank account"}
{"create":{"_id":2}}
{"title" : "this is document 2", "text": "I have 123695 dollars in my bank account"}
{"create":{"_id":3}}
{"title" : "this is document 3", "text": "I have 52.85 dollars in my bank account"}
Response
{
  "hits" : {
    "hits" : [
      {
        "_index" : "regexp_fields",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.1068251,
        "_source" : {
          "title" : "this is document 1",
          "text" : "I have 56000 dollars in my bank account"
        }
      },
      {
        "_index" : "regexp_fields",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.1068251,
        "_source" : {
          "title" : "this is document 2",
          "text" : "I have 123695 dollars in my bank account"
        }
      },
      {
        "_index" : "regexp_fields",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.13353139,
        "_source" : {
          "title" : "this is document 3",
          "text" : "I have 52.85 dollars in my bank account"
        }
      }
    ]
  }
}
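Note that documents 1 and 2 get the same score here, because the regexp clause matches any document containing a term of five or more digits, whatever number was searched for. If the goal is to boost only when the queried number itself has five or more digits, one possible approach is to test the token on the client side and put the boost on the match clause instead. This is only a sketch under that assumption: build_number_query and its parameters are hypothetical names, the field name document.number and the boost of 5 are taken from the question, and the reasoning is that a term equal to the token satisfies the pattern exactly when the token does.

```python
import re

# Boost pattern from the question: five or more consecutive digits.
BOOST_PATTERN = re.compile(r"[0-9]{5,}")


def build_number_query(token, field="document.number", boost=5.0):
    """Build a match clause whose boost depends on the queried token itself
    (hypothetical helper; the regexp is checked client-side, not in Elasticsearch)."""
    is_long_number = BOOST_PATTERN.fullmatch(token) is not None
    return {
        "match": {
            field: {
                "query": token,
                # Boost only when the searched number satisfies the pattern.
                "boost": boost if is_long_number else 1.0,
            }
        }
    }


print(build_number_query("56000"))
print(build_number_query("52.85"))
```

The resulting clause could be appended to content_query as in the question. Documents that merely contain some other long number are then not boosted, since no regexp clause runs against the index.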