Elasticsearch null pointer exception error with a scripted metric on a date_histogram

Question · Votes: 0 · Answers: 1

In Elasticsearch I have a key that is an array, and I need to count the unique items in that array. No items overlap with other documents. So I did this:

The scripted_metric only works when it is not inside the date_histogram:

'aggs' => [
    'groupByWeek' => [
        'date_histogram' => [
            'field' => 'date',
            'calendar_interval' => '1w',
        ],
        'aggs' => [
            'count_unique_locations' => [
                'scripted_metric' => [
                    'init_script' => 'state.locations = []',
                    'map_script' => 'state.locations.addAll(doc.unique_locations_with_error)',
                    'combine_script' => 'return state.locations',
                    'reduce_script' => "
                        def locations = [];
                        for (state in states) {
                            for (location in state) {
                                if (!locations.contains(location) && location != '') {
                                    locations.add(location);
                                }
                            }
                        }
                        return locations.length;
                    ",
                ],
            ],
        ],
    ],
],

When I run the query, I get this error:

{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "script_exception",
      "reason": "runtime error",
      "script_stack": [
        "locations = []; ",
        "            ^---- HERE"
      ],
      "script": "def locations = []; for (state in states) { for(location in state){if(!locations.contains(location) && location != '' ) {locations.add(location); }}} return locations.length;",
      "lang": "painless",
      "position": {
        "offset": 16,
        "start": 4,
        "end": 20
      },
      "caused_by": {
        "type": "null_pointer_exception",
        "reason": "cannot access method/field [iterator] from a null def reference"
      }
    }
  },
  "status": 400
}

I think this has something to do with the document being null here, but I don't know why, or how to fix it.

elasticsearch
1 Answer · Votes: 0

If you want to count the unique array items across the documents in a date range, you can use a script-free aggregation:

Mapping

PUT /unique_array_item
{
    "mappings": {
        "properties": {
            "text": {
                "type": "keyword"
            },
            "date": {
                "type": "date"
            }
        }   
    }
}

Documents

PUT /unique_array_item/_bulk
{"create":{"_id":1}}
{"date":"2024-03-28","text":["banana","banana"]}
{"create":{"_id":2}}
{"date":"2024-03-28","text":["apple","apple","apple","apple","apple","apple","banana"]}
{"create":{"_id":3}}
{"date":"2024-03-14","text":["cherry","banana","apple"]}
{"create":{"_id":4}}
{"date":"2024-03-14","text":["pineapple"]}

Aggregation query

GET /unique_array_item/_search?filter_path=aggregations
{
    "aggs": {
        "by_week": {
            "date_histogram": {
                "field": "date",
                "calendar_interval": "1w",
                "min_doc_count": 1
            },
            "aggs": {
                "date_interval_unique_array_item_count": {
                    "cardinality": {
                        "field": "text"
                    }
                }
            }
        }
    }
}
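
One caveat: the cardinality aggregation is approximate (it is based on the HyperLogLog++ algorithm), though for small value sets like this one it is exact in practice. If you expect many distinct values per bucket, you can raise precision_threshold (default 3000, maximum 40000) to keep counts exact up to that threshold:

"date_interval_unique_array_item_count": {
    "cardinality": {
        "field": "text",
        "precision_threshold": 40000
    }
}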

Response

{
    "aggregations" : {
        "by_week" : {
            "buckets" : [
                {
                    "key_as_string" : "2024-03-11T00:00:00.000Z",
                    "key" : 1710115200000,
                    "doc_count" : 2,
                    "date_interval_unique_array_item_count" : {
                        "value" : 4
                    }
                },
                {
                    "key_as_string" : "2024-03-25T00:00:00.000Z",
                    "key" : 1711324800000,
                    "doc_count" : 2,
                    "date_interval_unique_array_item_count" : {
                        "value" : 2
                    }
                }
            ]
        }
    }
}

Let's verify:

  • "2024-03-11T00:00:00.000Z":
    ["cherry","banana","apple","pineapple"]
    = 4
  • "2024-03-25T00:00:00.000Z":
    ["apple","banana"]
    = 2

The response is correct.
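
As an aside, if you really do need the scripted metric (for logic that cardinality cannot express): the null_pointer_exception in the question most likely comes from the reduce phase, where states can contain a null entry for shards that produced no state for a bucket, so for (location in state) ends up iterating a null reference. Below is a minimal sketch of a guarded version, reusing the field and aggregation names from the question (the index name here is hypothetical), with a HashSet replacing the manual contains check and size() instead of length:

GET /your_index/_search
{
    "size": 0,
    "aggs": {
        "groupByWeek": {
            "date_histogram": {
                "field": "date",
                "calendar_interval": "1w"
            },
            "aggs": {
                "count_unique_locations": {
                    "scripted_metric": {
                        "init_script": "state.locations = []",
                        "map_script": "if (doc.containsKey('unique_locations_with_error')) { state.locations.addAll(doc['unique_locations_with_error']); }",
                        "combine_script": "return state.locations",
                        "reduce_script": "def locations = new HashSet(); for (state in states) { if (state == null) { continue; } for (location in state) { if (location != '') { locations.add(location); } } } return locations.size();"
                    }
                }
            }
        }
    }
}

Still, the script-free cardinality approach above is simpler and faster wherever it fits.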
