Elasticsearch: filter by nested document count

How do I write a query that filters on the number of matching nested documents?

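I have an Elasticsearch index containing tasks and activities. A single task has many activities. Each of these activities has metadata associated with it. Each activity is a nested document.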

I want to filter for all tasks that have exactly 2 activities matching my activity filter criteria.

I tried using a script, but I am not getting the results I expect:

{
  "_source": false,
  "from": 0,
  "size": 50,
  "track_total_hits": true,
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must_not": {
              "exists": {
                "field": "deleted"
              }
            }
          }
        }
      ],
      "filter": [
        {
          "script_score": {
            "query": {
              "nested": {
                "path": "activities",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "activities.queue": "669820d5-f08a-4624-a18d-278f8d3ad70e"
                        }
                      },
                      {
                        "bool": {
                          "must_not": {
                            "exists": {
                              "field": "activities.assignee"
                            }
                          }
                        }
                      },
                      {
                        "match": {
                          "activities.status": "PENDING"
                        }
                      },
                      {
                        "bool": {
                          "must_not": {
                            "exists": {
                              "field": "activities.deleted"
                            }
                          }
                        }
                      }
                    ]
                  }
                },
                "inner_hits": {
                  "size": 100
                }
              }
            },
            "script": {
              "source": "int count = 0; for (def activity : doc['activities'].values) { if (activity.queue == params.queue && activity.status == params.status && !activity.containsKey('assignee') && !activity.containsKey('deleted')) { count++; } } return count == 2 ? 1 : 0;",
              "params": {
                "queue": "669820d5-f08a-4624-a18d-278f8d3ad70e",
                "status": "PENDING"
              }
            }
          }
        }
      ]
    }
  }
}

This just returns all tasks that have a single activity.

How can I filter by nested document count? Is there a way to achieve this without a script?

elasticsearch
1 Answer

Try reorganizing your data.

Make each activity its own document, carrying its metadata, with the task as a field of that document.

Sample documents

PUT /activities/_bulk
{"create":{"_id":1}}
{"task": "t1", "queue": 1, "status": "pending", "is_deleted": true, "assignee": "Harry"}
{"create":{"_id":2}}
{"task": "t1", "queue": 1, "status": "pending", "is_deleted": false}
{"create":{"_id":3}}
{"task": "t1", "queue": 1, "status": "pending", "is_deleted": false}
{"create":{"_id":4}}
{"task": "t2", "queue": 1, "status": "pending", "is_deleted": false}
{"create":{"_id":5}}
{"task": "t2", "queue": 1, "status": "pending", "is_deleted": false}
{"create":{"_id":6}}
{"task": "t3", "queue": 1, "status": "pending", "is_deleted": false}

Query

GET /activities/_search?filter_path=aggregations.by_task.buckets.key
{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "is_deleted": false
                    }
                },
                {
                    "term": {
                        "queue": 1
                    }
                },
                {
                    "bool": {
                        "must_not": {
                            "exists": {
                                "field": "assignee"
                            }
                        }
                    }
                },
                {
                    "term": {
                        "status": "pending"
                    }
                }
            ]
        }
    },
    "aggs": {
        "by_task": {
            "terms": {
                "field": "task.keyword"
            },
            "aggs": {
                "filter_doc_count_2": {
                    "bucket_selector": {
                        "buckets_path": {
                            "activity_count": "_count"
                        },
                        "script": "params.activity_count == 2"
                    }
                }
            }
        }
    }
}
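
If the same search has to be issued from application code with different filter values, the request body above could be assembled by a small helper. The following is only a minimal Python sketch: `build_activity_count_query`, its parameters, the `exact_count` aggregation name, the explicit `"size": 0`, and passing the target count through the script's `params` are all assumptions for illustration and are not part of the original answer. It only builds the body as a dict; sending it to the cluster is left out.

```python
import json


def build_activity_count_query(queue, status, target_count, max_tasks=10):
    """Build a search body: filter activities, group by task, keep exact counts.

    Hypothetical helper; field names follow the sample documents above.
    """
    return {
        # Assumption: only the aggregation buckets are needed, not the hits.
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"is_deleted": False}},
                    {"term": {"queue": queue}},
                    {"term": {"status": status}},
                    # Unassigned activities only: no "assignee" field present.
                    {"bool": {"must_not": {"exists": {"field": "assignee"}}}},
                ]
            }
        },
        "aggs": {
            "by_task": {
                # One bucket per task; max_tasks caps how many buckets come back.
                "terms": {"field": "task.keyword", "size": max_tasks},
                "aggs": {
                    "exact_count": {
                        "bucket_selector": {
                            "buckets_path": {"activity_count": "_count"},
                            # Assumption: the target is supplied as a script param
                            # instead of being hard-coded in the script source.
                            "script": {
                                "source": "params.activity_count == params.target",
                                "params": {"target": target_count},
                            },
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_activity_count_query(1, "pending", 2), indent=2))
```

One thing to keep in mind with this approach: the `terms` aggregation returns only a limited number of buckets (10 by default), and the `bucket_selector` works on the buckets that were returned. With many tasks you would likely need to raise `size` or page through the buckets some other way.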

The response contains the tasks that have exactly 2 matching activities

{
    "aggregations" : {
        "by_task" : {
            "buckets" : [
                {
                    "key" : "t1"
                },
                {
                    "key" : "t2"
                }
            ]
        }
    }
}
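
To see why t1 and t2 come back but t3 does not, the same filter, group, and count logic can be replayed in plain Python over the six sample documents. This is only an illustrative sketch of what the query and the `bucket_selector` do together, not how Elasticsearch executes it; `tasks_with_exact_count` is a hypothetical name.

```python
from collections import Counter

# The six sample documents from the bulk request above.
ACTIVITIES = [
    {"task": "t1", "queue": 1, "status": "pending", "is_deleted": True, "assignee": "Harry"},
    {"task": "t1", "queue": 1, "status": "pending", "is_deleted": False},
    {"task": "t1", "queue": 1, "status": "pending", "is_deleted": False},
    {"task": "t2", "queue": 1, "status": "pending", "is_deleted": False},
    {"task": "t2", "queue": 1, "status": "pending", "is_deleted": False},
    {"task": "t3", "queue": 1, "status": "pending", "is_deleted": False},
]


def tasks_with_exact_count(activities, queue, status, target_count):
    """Return the tasks that have exactly target_count matching activities."""
    # Step 1 (the bool filter): keep non-deleted, unassigned activities
    # in the given queue and status.
    # Step 2 (the terms aggregation): count the survivors per task.
    counts = Counter(
        a["task"]
        for a in activities
        if not a["is_deleted"]
        and a["queue"] == queue
        and a["status"] == status
        and "assignee" not in a
    )
    # Step 3 (the bucket_selector): keep only tasks with the exact count.
    return sorted(task for task, n in counts.items() if n == target_count)


if __name__ == "__main__":
    print(tasks_with_exact_count(ACTIVITIES, 1, "pending", 2))  # prints ['t1', 't2']
```

Document 1 is dropped by the filter (it is deleted and has an assignee), so t1 ends up with two matching activities, t2 with two, and t3 with one.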