Elasticsearch: sort by a field in the top_hits parameter

Question

I'm trying to sort the data by a field inside the top_hits parameter of my Elasticsearch search query, but somehow it has no effect. Can anyone help me with this?

So I tried using sort, as some people suggested:

{
    "size" : 0,
    "from" : 0,
    "aggs": {
        "by_filter": {
            "filter": {
                "bool": {
                    "must": [
                    {
                        "range": {
                            "published_at": {
                                "gte": "2019-08-01 00:00:00",
                                "lte": "2023-10-30 23:59:59"
                            }
                        }
                    },
                    {
                        "match": {
                            "status": "published"
                        }
                    }
                    ]
                }
            },
            "aggs": {
                "by_created": {
                    "terms": {
                        "field": "created_by.id",
                        "size": 10
                    },
                    "aggs" : {
                        "count_data": {
                            "terms": {
                                "field": "created_by.id"
                            }
                        },
                        "hits": {
                            "top_hits": {
                                "sort": [                         <---- the sort query that I found
                                    {
                                        "created_by.name.keyword": {
                                            "order": "desc"
                                        }
                                    }
                                ],
                                "_source":["created_by.name"],
                                "size": 1
                            }
                        }
                    }
                }
            }
        }
    }
}

But the result did not change:


"aggregations": {
    "by_filter": {
        "doc_count": 21,
        "by_created": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 3,
            "buckets": [
                {
                    "key": 34,
                    "doc_count": 3,
                    "hits": {
                        "hits": {
                            "total": {
                                "value": 3,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "re_article",
                                    "_id": "53822",
                                    "_score": null,
                                    "_source": {
                                        "created_by": {
                                            "name": "Edwin"
                                        }
                                    },
                                    "sort": [                <--- I think this is the result of the sort
                                        "Edwin"
                                    ]
                                }
                            ]
                        }
                    },
                    "count_data": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": 34,
                                "doc_count": 3
                            }
                        ]
                    }
                },
                {
                    "key": 52,
                    "doc_count": 3,
                    "hits": {
                        "hits": {
                            "total": {
                                "value": 3,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "re_article",
                                    "_id": "338610",
                                    "_score": null,
                                    "_source": {
                                        "created_by": {
                                            "name": "Tito"
                                        }
                                    },
                                    "sort": [
                                        "Tito"
                                    ]
                                }
                            ]
                        }
                    },
                    "count_data": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": 52,
                                "doc_count": 3
                            }
                        ]
                    }
                }
            ]
        }
    }
}

What I expect, if possible, is for the buckets to show the data created by "Tito" first and then "Edwin", like this:


"aggregations": {
    "by_filter": {
        "doc_count": 21,
        "by_created": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 3,
            "buckets": [
                {
                    "key": 52,
                    "doc_count": 3,
                    "hits": {
                        "hits": {
                            "total": {
                                "value": 3,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "re_article",
                                    "_id": "338610",
                                    "_score": null,
                                    "_source": {
                                        "created_by": {
                                            "name": "Tito"
                                        }
                                    }
                                }
                            ]
                        }
                    },
                    "count_data": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": 52,
                                "doc_count": 3
                            }
                        ]
                    }
                },
                {
                    "key": 34,
                    "doc_count": 3,
                    "hits": {
                        "hits": {
                            "total": {
                                "value": 3,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "re_article",
                                    "_id": "53822",
                                    "_score": null,
                                    "_source": {
                                        "created_by": {
                                            "name": "Edwin"
                                        }
                                    }
                                }
                            ]
                        }
                    },
                    "count_data": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": 34,
                                "doc_count": 3
                            }
                        ]
                    }
                }
            ]
        }
    }
}

I think I picked the wrong example, because the top_hits result now has a new "sort" field, but that is not what I actually want. Can anyone help? Thank you.

Here is a sample of the data I have:

{
    "id": 53822,
    "created_at": "2019-09-03 18:17:13",
    "published_at": "2019-09-04 01:17:13",
    "status": "published",
    "created_by": {
        "id": 34,
        "name": "Edwin",
        "role_id": 4,
        "is_active": "Y"
    }
},
{
    "id": 338610,
    "created_at": "2022-10-16 20:48:39",
    "published_at": "2022-10-16 21:08:12",
    "status": "published",
    "created_by": {
        "id": 52,
        "name": "Tito",
        "role_id": 4,
        "is_active": "Y"
    }
},
{
    "id": 54272,
    "created_at": "2019-09-10 08:28:57",
    "published_at": "2019-09-10 15:30:03",
    "status": "published",
    "created_by": {
        "id": 34,
        "name": "Edwin",
        "role_id": 4,
        "is_active": "Y"
    }
}

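I'm trying to group by the field created_by.id with the count_data aggs and then sort the result by created_by.name. That is why I include the top_hits parameter: I need to display the person's name, not just the person's id.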

Also, I need the bucket key to be grouped by created_by.id, not by created_by.name, even though the same id always has the same name.

Answer 1

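Unfortunately, there is no good way to do this. There are ways to sort an aggregation by the numeric value of a sub-aggregation, but in your case you need to sort by a string value. I'll list a few possible workarounds in order of my preference, from worst to best: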

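1. Using runtime fields, you can combine the key as `Edwin:34` and `Tito:52` and run a `terms` aggregation on this runtime field. The problem here is that you need to parse the key in your application, and if a name changes while the id stays the same, it will produce two buckets instead of one.
2. Since you are already doing post-processing, you can run the terms aggregation by name, use top_hits to retrieve the id, and use that id for lookups. The problem with this solution is that it breaks if different ids share the same name or if a name changes.
3. Since you are already doing post-processing in your application, you can retrieve the name the way you do in your example and then sort the buckets in the application.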
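
The last workaround, sorting in the application, could look roughly like the Python sketch below. It assumes the response has already been parsed into a dict with the same shape as the output in the question; the helper names `bucket_name` and `sort_buckets_by_name` are made up for illustration. Keep in mind that this only reorders the buckets the terms aggregation returned (at most `size`, chosen by doc_count), so it is not a global sort over all creators.

```python
def bucket_name(bucket):
    """Return created_by.name from the bucket's single top_hits document."""
    return bucket["hits"]["hits"]["hits"][0]["_source"]["created_by"]["name"]


def sort_buckets_by_name(response, descending=True):
    """Return the by_created buckets ordered by the creator's name."""
    buckets = response["aggregations"]["by_filter"]["by_created"]["buckets"]
    return sorted(buckets, key=bucket_name, reverse=descending)


# Trimmed-down copy of the response from the question (unused keys removed).
response = {
    "aggregations": {
        "by_filter": {
            "by_created": {
                "buckets": [
                    {"key": 34, "doc_count": 3,
                     "hits": {"hits": {"hits": [
                         {"_source": {"created_by": {"name": "Edwin"}}}]}}},
                    {"key": 52, "doc_count": 3,
                     "hits": {"hits": {"hits": [
                         {"_source": {"created_by": {"name": "Tito"}}}]}}},
                ]
            }
        }
    }
}

ordered = sort_buckets_by_name(response)
print([(b["key"], bucket_name(b)) for b in ordered])
# [(52, 'Tito'), (34, 'Edwin')]
```

The bucket keys stay as created_by.id, which matches the requirement in the question; only the order of the list changes.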

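I know this is not the solution you were looking for, but as far as I know, given the current limitations of the aggregations framework, this is the best we can do.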
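
For completeness, the runtime-field workaround might be sketched as follows, with the request body built as a Python dict. Several things here are assumptions rather than part of the original question: the runtime field name `creator_key`, the Painless script, moving the date and status filter into the top-level query, and a cluster recent enough to accept `runtime_mappings` in a search request. Because the terms aggregation is ordered by `_key`, the buckets come back sorted by the combined string, so `size` now picks the first names in that order instead of the largest buckets.

```python
import json

# Hypothetical runtime field name; it does not exist in the original mapping.
RUNTIME_FIELD = "creator_key"


def build_request():
    """Build a search body that buckets on a combined 'name:id' key."""
    return {
        "size": 0,
        "runtime_mappings": {
            RUNTIME_FIELD: {
                "type": "keyword",
                "script": {
                    # Assumed Painless script: emit "name:id" only when
                    # both fields are present on the document.
                    "source": (
                        "if (doc['created_by.name.keyword'].size() != 0 "
                        "&& doc['created_by.id'].size() != 0) { "
                        "emit(doc['created_by.name.keyword'].value + ':' "
                        "+ doc['created_by.id'].value); }"
                    )
                },
            }
        },
        "query": {
            "bool": {
                "must": [
                    {"range": {"published_at": {
                        "gte": "2019-08-01 00:00:00",
                        "lte": "2023-10-30 23:59:59"}}},
                    {"match": {"status": "published"}},
                ]
            }
        },
        "aggs": {
            "by_created": {
                "terms": {
                    "field": RUNTIME_FIELD,
                    "size": 10,
                    "order": {"_key": "desc"},
                }
            }
        },
    }


def parse_key(key):
    """Split 'Tito:52' into ('Tito', 52); splits on the last colon."""
    name, _, creator_id = key.rpartition(":")
    return name, int(creator_id)


print(json.dumps(build_request()["aggs"], indent=2))
print(parse_key("Tito:52"))
```

Splitting on the last colon keeps names that themselves contain a colon intact. The drawback named in the answer still applies: if a name changes for the same id, the two spellings end up in separate buckets.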
