How to sort data before applying a terms aggregation in Elasticsearch


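We want to sort the data by a couple of fields before applying a terms aggregation. We tried nesting terms aggregations and ordering the sub-aggregations. The sorting works, but the terms aggregation returns duplicate records.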

Elasticsearch version: 8.7.1

Sample Elasticsearch document:

{
  "itemDetails": {
    "itemId": "3076",
    "itemId2": "1003918865",
    "usecase": "habc",
    "usecaseId": "xyz"
  },
  "metaData": {
    "cId": "96ff54507c2d018e5c767785c705a5b2",
    "date1": "2023-09-29T12:29:54",
    "date2": "2023-09-29T12:30:09"
  }
other properties....
}

In the document above, I first want to sort the records by date1 and date2, and then get distinct records by "cId".

With the query below we can sort the data by date1 and date2, but we get records with duplicate cId values:

Query 1:

POST /index/_search?typed_keys=true
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "itemDetails.itemId": [
              "3076"
            ]
          }
        },
        {
          "terms": {
            "itemDetails.usecase": [
              "habc"
            ]
          }
        },
        {
          "range": {
            "metaData.date1": {
              "lte": "2023-09-30T19:55:54.611Z",
              "gte": "2023-09-27T19:55:54.611Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "sortby_date1": {
      "terms": {
        "field": "metaData.date1",
        "order": {
          "_key": "desc"
        },
        "size": 6
      },
      "aggs": {
        "sortby_date2": {
          "terms": {
            "field": "metaData.date2",
            "order": {
              "_key": "desc"
            }
          },
          "aggs": {
            "groupby_cId": {
              "terms": {
                "field": "metaData.cId"
              },
              "aggs": {
                "top_doc": {
                  "top_hits": {
                    "size": 1,
                    "_source": {
                      "includes": [
                        "itemDetails.itemId",
                        "itemDetails.usecase",
                        "metaData.cId",
                        "metaData.date1"
                      
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Query 2: When combined with a top-level sort, the terms aggregation does not return unique records.

{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "itemDetails.itemId": [
              "3077"
            ]
          }
        },
        {
          "terms": {
            "itemDetails.usecase": [
              "xyz"
            ]
          }
        },
        {
          "range": {
            "metaData.date1": {
              "lte": "2023-09-30T19:55:54.611Z",
              "gte": "2023-09-27T19:55:54.611Z"
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "metaData.date1": {
        "order": "desc"
      }
    },
    {
      "metaData.date2": {
        "order": "desc"
      }
    }
  ],
  "aggs": {
    "distinct_cIds": {
      "terms": {
        "field": "metaData.cId"
      },
      "aggs": {
        "top_doc": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": [
                        "itemDetails.itemId",
                        "itemDetails.usecase",
                        "metaData.cId",
                        "metaData.date1"
              ]
            }
          }
        }
      }
    }
  }
}
elasticsearch elasticsearch-aggregation elasticsearch-8
1 Answer

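You're almost there. You should simply aggregate on cId to get one bucket per unique cId, then return the top hit of each bucket, sorted by date1 and date2 in descending order.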

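Note that the top-level sort only applies to hits and has no effect on aggregations. Since you are only interested in the aggregation results (i.e. the unique cIds) and have set size to 0, the top-level sort has no value for you, but you can add the same sort to the top_hits aggregation: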

POST /index/_search?typed_keys=true
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "itemDetails.itemId": [
              "3076"
            ]
          }
        },
        {
          "terms": {
            "itemDetails.usecase": [
              "habc"
            ]
          }
        },
        {
          "range": {
            "metaData.date1": {
              "lte": "2023-09-30T19:55:54.611Z",
              "gte": "2023-09-27T19:55:54.611Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "distinct_cIds": {
      "terms": {
        "field": "metaData.cId"
      },
      "aggs": {
        "top_doc": {
          "top_hits": {
            "size": 1,
            "sort": [
              { "metaData.date1": "desc" },
              { "metaData.date2": "desc" }
            ],
            "_source": {
              "includes": [
                        "itemDetails.itemId",
                        "itemDetails.usecase",
                        "metaData.cId",
                        "metaData.date1"
              ]
            }
          }
        }
      }
    }
  }
}
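
One caveat: in the query above, the sort inside top_hits only picks the newest document within each cId bucket, while the buckets themselves still come back ordered by document count. If the goal is to list the cIds with the most recent date1 first, one possible sketch is to add a max sub-aggregation and order the terms buckets by it. The name latest_date1 and the size of 6 (borrowed from Query 1) are illustrative assumptions, and the sketch assumes metaData.date1 is mapped as a date and metaData.cId as a keyword. The body goes to the same POST /index/_search endpoint.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "terms": { "itemDetails.itemId": ["3076"] } },
        { "terms": { "itemDetails.usecase": ["habc"] } },
        {
          "range": {
            "metaData.date1": {
              "lte": "2023-09-30T19:55:54.611Z",
              "gte": "2023-09-27T19:55:54.611Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "distinct_cIds": {
      "terms": {
        "field": "metaData.cId",
        "size": 6,
        "order": { "latest_date1": "desc" }
      },
      "aggs": {
        "latest_date1": {
          "max": { "field": "metaData.date1" }
        },
        "top_doc": {
          "top_hits": {
            "size": 1,
            "sort": [
              { "metaData.date1": "desc" },
              { "metaData.date2": "desc" }
            ],
            "_source": {
              "includes": [
                "itemDetails.itemId",
                "itemDetails.usecase",
                "metaData.cId",
                "metaData.date1"
              ]
            }
          }
        }
      }
    }
  }
}
```

Each bucket then also reports its newest date1 under latest_date1, next to the single top_doc hit. Ordering terms buckets by a sub-aggregation can be approximate when the index spans several shards, so treat the bucket order as a best effort unless you have checked it against your data.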