Strange behavior of a range clause in an Elasticsearch RESTful query when I raise the upper bound of the range


I am using Elasticsearch 8.8.0.

I currently have this query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "deliveryStatus": {
                            "gte": -2,
                            "lte": 9
                        }
                    }
                },
                {
                    "range": {
                        "cardDate": {
                            "gte": "2024-04-16",
                            "lte": "2024-04-16"
                        }
                    }
                }
            ]
        }
    }
}

It gives me this result:

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 828,
            "relation": "eq"
        },
        "max_score": 2.0,
        "hits": [
            {
                "_index": "credit",
                "_id": "70230254011604",
                "_score": 2.0,
                "_source": {
                    "cardDate": "2024-04-16T00:00:00",
                    "deliveryStatus": "3"
                }
            }
        ]
    }
}

However, when I raise the upper bound of deliveryStatus from 9 to 10, as in this query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "deliveryStatus": {
                            "gte": -2,
                            "lte": 10
                        }
                    }
                },
                {
                    "range": {
                        "cardDate": {
                            "gte": "2024-04-16",
                            "lte": "2024-04-16"
                        }
                    }
                }
            ]
        }
    }
}

As I understand it, the result should stay the same. But when I run it, this is what I get:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    }
}

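This is strange and does not match my current understanding. Out of curiosity, I removed the cardDate clause to see whether I would get the same result, as in this query: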

{
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "deliveryStatus": {
                            "gte": -2,
                            "lte": 10
                        }
                    }
                }
            ]
        }
    }
}

I got this result:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "credit",
                "_id": "65391401441421011",
                "_score": 1.0,
                "_source": {
                    "cardDate": "2023-11-10T00:00:00",
                    "deliveryStatus": -1
                }
            }
        ]
    }
}

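I want the upper bound of deliveryStatus to be 13. Can someone explain why this happens and how to fix it? (Whenever I change the upper bound to 10 or above, I get an empty list.)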

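Edit: I read that this could be a data type mismatch in the index. But if that were the case, wouldn't the first query fail as well? The current data type of deliveryStatus in my Elasticsearch index is "text", and I don't want to delete the index before I understand the problem, because it already holds quite a lot of data.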

Edit 2: As requested, here are the mappings from my index.

cardDate:

{
    "credit": {
        "mappings": {
            "cardDate": {
                "full_name": "cardDate",
                "mapping": {
                    "cardDate": {
                        "type": "date"
                    }
                }
            }
        }
    }
}

And this is deliveryStatus:

{
    "credit": {
        "mappings": {
            "deliveryStatus": {
                "full_name": "deliveryStatus",
                "mapping": {
                    "deliveryStatus": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        },
                        "fielddata": true
                    }
                }
            }
        }
    }
}

Tags: elasticsearch

1 answer

@Gagak

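Here is how it works:

A range query behaves according to the field type. If the field is a string type, such as keyword or text, the terms are compared in lexicographic (alphabetical) order.

If the field is a numeric type, the values are compared numerically.

But in your case: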

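The numeric values are stored in a string field (text, with a keyword sub-field), so the range bounds are compared lexicographically and the query returns the wrong results. As strings, "3" sorts before "9" but after "10", which is why the first query matches and the second one returns nothing.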
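To make the ordering concrete, here is a minimal Python sketch of the comparison, not Elasticsearch code. It assumes the indexed terms are plain ASCII strings such as "3", for which Python's code point ordering agrees with byte-wise term ordering. The helper `in_string_range` and the sample `terms` list are hypothetical and only modelled on the values in the question.

```python
def in_string_range(term: str, gte: str, lte: str) -> bool:
    """Inclusive range check using string (code point) ordering, not numeric ordering."""
    return gte <= term <= lte


# Hypothetical stored terms, modelled on the deliveryStatus values in the question.
terms = ["0", "1", "3", "9", "10", "13"]

# Upper bound "9": every sample term sorts at or before "9", so all of them match.
print([t for t in terms if in_string_range(t, "-2", "9")])

# Upper bound "10": "3", "9" and "13" sort after "10", so they drop out.
print([t for t in terms if in_string_range(t, "-2", "10")])
```

If every document dated 2024-04-16 has a status that sorts after "10" as a string (for example "3"), raising the bound to 10 would leave no matches, which is consistent with the empty result in the question.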

Let's walk through an example: create an index with a keyword field and index some data.

PUT test-range
{
  "mappings": {
    "properties": {
      "message":{
        "type": "keyword"
      }
    }
  }
}

POST test-range/_doc
{
  "message":17
}

POST test-range/_doc
{
  "message":2
}

POST test-range/_doc
{
  "message":23
}

POST test-range/_doc
{
  "message":"ab"
}

POST test-range/_doc
{
  "message":"ac"
}

POST test-range/_doc
{
  "message":"ad"
}

POST test-range/_doc
{
  "message":"ac1"
}

Now run a range query on the text values:

GET test-range/_search
{
  "query": {
    "range": {
      "message": {
        "gte": "ac",
        "lte": "ad"
      }
    }
  }
}

It returns the correct results, as shown below:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test-range",
        "_id": "zfeT944BZFU3azzCOTSB",
        "_score": 1,
        "_source": {
          "message": "ac"
        }
      },
      {
        "_index": "test-range",
        "_id": "OF6T944BMvvuJ06pQGtI",
        "_score": 1,
        "_source": {
          "message": "ad"
        }
      },
      {
        "_index": "test-range",
        "_id": "3feT944BZFU3azzCjDQP",
        "_score": 1,
        "_source": {
          "message": "ac1"
        }
      }
    ]
  }
}

Now run a range query on the numbers (stored as keyword):

GET test-range/_search
{
  "query": {
    "range": {
      "message": {
        "gte": 17,
        "lte": 23
      }
    }
  }
}

Response:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test-range",
        "_id": "OfeR944BZFU3azzCYDQ6",
        "_score": 1,
        "_source": {
          "message": 17
        }
      },
      {
        "_index": "test-range",
        "_id": "pV6R944BMvvuJ06paWrr",
        "_score": 1,
        "_source": {
          "message": 2
        }
      },
      {
        "_index": "test-range",
        "_id": "S_eR944BZFU3azzCdDSd",
        "_score": 1,
        "_source": {
          "message": 23
        }
      }
    ]
  }
}

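Here the bounds and the stored values are compared as strings, which does not give a meaningful order for numbers. "2" sorts between "17" and "23", so it is returned even though 2 is numerically outside the range. Conversely, if I query the range 2 to 23, it will not return 17, because "17" sorts before "2".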

Solution

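Use a numeric field type to run range queries on numbers, and the date type for dates. Mapping the field with the correct type should solve your problem.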
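The type of an existing field cannot be changed in place, so one common way to apply this without losing data is to create a new index with the right mapping and copy the documents across with the `_reindex` API. The sketch below only builds the two request bodies as Python dicts so they can be inspected. The destination name `credit-v2` and the helper `migration_requests` are hypothetical, and it assumes every stored deliveryStatus value parses as an integer (string values such as "3" are coerced by default).

```python
import json


def migration_requests(source: str, dest: str) -> list[tuple[str, str, dict]]:
    """Return (method, path, body) triples for a create-then-reindex migration."""
    # New index: deliveryStatus becomes numeric, cardDate stays a date.
    create_index = {
        "mappings": {
            "properties": {
                "deliveryStatus": {"type": "integer"},
                "cardDate": {"type": "date"},
            }
        }
    }
    # Copy every document from the old index into the new one.
    reindex = {"source": {"index": source}, "dest": {"index": dest}}
    return [("PUT", f"/{dest}", create_index), ("POST", "/_reindex", reindex)]


# "credit-v2" is a hypothetical name for the new index.
for method, path, body in migration_requests("credit", "credit-v2"):
    print(method, path)
    print(json.dumps(body, indent=2))
```

Once the copy has finished and the new index has been checked, the same range query with "lte": 13 can be pointed at the new index, where the bounds are compared numerically. The old index stays untouched until you decide to remove it.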
