I have the following index data:
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 7992,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_id": "33952",
"default_fee": 12,
"custom_dates": [
{
"date": "2023-11-01",
"price": 100
},
{
"date": "2023-11-02",
"price": 50
}
],
"options": [
{
"id": 95,
"cost": 5,
"type": [
"Car"
]
}
]
}
]
}
}
I added a script field named total to calculate the total at runtime, as shown below:
{
script_fields: {
total: {
script: {
source: "
DateTimeFormatter formatter = DateTimeFormatter.ofPattern('yyyy-MM-dd');
def from = LocalDate.parse(params.checkin, formatter);
def to = LocalDate.parse(params.checkout, formatter);
def stay = params.total_stay;
def custom_price_dates = [];
if (params['_source']['custom_dates'] != null && !params['_source']['custom_dates'].isEmpty()) {
custom_price_dates = params['_source']['custom_dates'].stream()
.filter(filter_doc -> {
def date = LocalDate.parse(filter_doc.start_date, formatter);
return !date.isBefore(from) && !date.isAfter(to.minusDays(1));
})
.collect(Collectors.toList());
}
def custom_price = custom_price_dates.stream().mapToDouble(custom_doc -> custom_doc.price).sum();
def default_price = stay == custom_price_dates.size() ? 0 : (stay - custom_price_dates.size()) * params['_source']['default_fee'];
def calc_price = default_price + custom_price;
return calc_price;
",
params: {
checkin: Date.current.to_s,
checkout: Date.current.to_s,
total_stay: 2
}
}
}
},
_source: ["*"]
}
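This returns the total as a script field. Now I want to filter on a range of that total. How can I achieve this? I tried using a script query, but it does not loop over custom_dates because that field is a nested type.
Also, I cannot index the total in advance, because the check-in and check-out dates are dynamic and the given dates may have custom prices. Please advise.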
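This can be done, but it is fairly complicated. First, we need to understand that a search executes in two phases: query and fetch. In the query phase, each shard collects its top 10 hits (the default size) ordered by the sort key, which is _score by default. The coordinating node then gathers these ids and sort keys from all shards and selects the overall top 10. In the fetch phase, it asks the relevant shards to return those documents. Script fields are calculated during the fetch phase, so filters, which run in the query phase, cannot access them.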
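To make the two phases concrete, here is a toy model in Python. It is only an illustration of the flow described above, not Elasticsearch internals: the shard layout, the `score` values, and the function names are all made up for the example. The point it shows is that the script only ever sees documents that have already been selected.

```python
# Toy model of a two-phase search; every name here is hypothetical.

def query_phase(shard, size):
    """Each shard returns only (sort_key, doc_id) pairs for its top hits."""
    ranked = sorted(((doc["score"], doc_id) for doc_id, doc in shard.items()), reverse=True)
    return ranked[:size]

def fetch_phase(shards, winners, script):
    """Documents are loaded, and script fields computed, only for the winners."""
    hits = []
    for _score, doc_id in winners:
        doc = next(shard[doc_id] for shard in shards if doc_id in shard)
        hits.append({"_id": doc_id, "total": script(doc)})
    return hits

def search(shards, size, script):
    candidates = [pair for shard in shards for pair in query_phase(shard, size)]
    winners = sorted(candidates, reverse=True)[:size]  # merge on the coordinating node
    return fetch_phase(shards, winners, script)

shards = [
    {"a": {"score": 1.0, "default_fee": 12}, "b": {"score": 0.5, "default_fee": 24}},
    {"c": {"score": 0.9, "default_fee": 30}},
]
hits = search(shards, size=2, script=lambda doc: doc["default_fee"] * 2)
```

By the time `script` runs, document `b` has already been dropped, so nothing computed in the fetch phase can influence which documents match.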
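To make matters worse, you index the custom dates as nested objects. Internally, nested objects are indexed as separate documents, and the only way to pass information from them to the main query is through _score. So, to achieve what you want with nested objects, you basically need to encode the price into _score. To simplify the calculation, we need to store the price difference in the nested objects rather than the actual price. So if the default price is 12 and the special price is 100, we need to store 88.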
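The reason for storing differences is that the total then becomes a plain sum: the default fee for every night plus the adjustments for the nights that have a custom price. A minimal sketch of that conversion, using the sample values from the question (the function names are hypothetical, and `total_price` assumes every adjustment passed in falls within the stay):

```python
def to_adjustments(default_fee, custom_dates):
    """Convert absolute custom prices into differences from the default fee."""
    return [
        {"start_date": d["date"], "price_adjustment": d["price"] - default_fee}
        for d in custom_dates
    ]

def total_price(default_fee, nights, adjustments):
    """Default fee for every night, corrected by the per-night adjustments."""
    return default_fee * nights + sum(a["price_adjustment"] for a in adjustments)

custom_dates = [
    {"date": "2023-11-01", "price": 100},
    {"date": "2023-11-02", "price": 50},
]
adjustments = to_adjustments(12, custom_dates)

# Three nights, two of them with custom prices: 12 + 100 + 50 = 162.
total = total_price(12, 3, adjustments)
```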
Then we can find all nested objects that match our date range:
{
"nested": {
"path": "custom_dates",
"query": {
"range": {
"custom_dates.start_date": {
"gte": "2023-10-31",
"lte": "2023-11-02"
}
}
}
}
}
Then we can wrap it in a script_score, which replaces the score with the price adjustment:
{
"nested": {
"path": "custom_dates",
"query": {
"script_score": {
"script": {
"source": "doc['custom_dates.price_adjustment'].value"
},
"query": {
"range": {
"custom_dates.start_date": {
"gte": "2023-10-31",
"lte": "2023-11-02"
}
}
}
}
},
"score_mode": "sum"
}
}
Then we can use another script_score to calculate the default price:
{
"script_score": {
"script": {
"params": {
"total_stay": 3
},
"source": "doc['default_fee'].value * params.total_stay"
},
"query": {
"match_all": {}
}
}
}
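Note that `total_stay` and the bounds of the date range have to describe the same nights. A small sketch of deriving both from a check-in and check-out date, assuming the check-out day itself is not charged (as in the question's script, which uses `to.minusDays(1)`); the function name is hypothetical:

```python
from datetime import date, timedelta

def stay_parameters(checkin, checkout):
    """Derive the range bounds and night count from ISO check-in/check-out dates."""
    first_night = date.fromisoformat(checkin)
    last_night = date.fromisoformat(checkout) - timedelta(days=1)
    nights = (last_night - first_night).days + 1
    if nights < 1:
        raise ValueError("checkout must be after checkin")
    return {
        "gte": first_night.isoformat(),
        "lte": last_night.isoformat(),
        "total_stay": nights,
    }

params = stay_parameters("2023-10-31", "2023-11-03")
```

With these inputs the result matches the values used in the snippets here: a range from 2023-10-31 to 2023-11-02 and a `total_stay` of 3.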
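Then we can combine them as two should clauses, whose scores are added together.
So now our _score is equal to the price of each record. The last step is to filter records by _score, which can be done with another script_score that sets the min_score parameter: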
"script_score": {
"query": {
"bool": {
"should": [
{
.... default price calculation ....
},
{
.... adjusted price calculation ....
}
]
}
},
"script": {
"source": "if (_score >= params.min_price && _score <=params.max_price) { 1 } else { 0 }",
"params": {
"min_price": 100,
"max_price": 200
}
},
"min_score": 1
}
Putting it all together, we get something like this:
DELETE test
PUT test
{
"mappings": {
"properties": {
"default_fee": {
"type": "double"
},
"custom_dates": {
"type": "nested",
"properties": {
"start_date": {
"type": "date"
},
"price_adjustment": {
"type": "double"
}
}
}
}
}
}
PUT test/_doc/33952?refresh
{
"default_fee": 12,
"custom_dates": [
{
"start_date": "2023-11-01",
"price_adjustment": 88
},
{
"start_date": "2023-11-02",
"price_adjustment": 38
}
],
"options": [
{
"id": 95,
"cost": 5,
"type": [
"Car"
]
}
]
}
PUT test/_doc/33953?refresh
{
"default_fee": 24,
"custom_dates": [
{
"start_date": "2023-11-01",
"price_adjustment": 12
},
{
"start_date": "2023-11-02",
"price_adjustment": 1
}
],
"options": [
{
"id": 95,
"cost": 5,
"type": [
"Truck"
]
}
]
}
POST test/_search
{
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"script_score": {
"script": {
"params": {
"total_stay": 3
},
"source": "doc['default_fee'].value * params.total_stay"
},
"query": {
"match_all": {}
}
}
},
{
"nested": {
"path": "custom_dates",
"query": {
"script_score": {
"script": {
"source": "doc['custom_dates.price_adjustment'].value"
},
"query": {
"range": {
"custom_dates.start_date": {
"gte": "2023-10-31",
"lte": "2023-11-02"
}
}
}
}
},
"score_mode": "sum"
}
}
]
}
},
"script": {
"source": "if (_score >= params.min_price && _score <=params.max_price) { 1 } else { 0 }",
"params": {
"min_price": 100,
"max_price": 200
}
},
"min_score": 1
}
}
}
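As a sanity check, the scores this request should produce for the two sample documents can be worked out by hand. The sketch below only mirrors the arithmetic of the query in Python (the sum of the two should clauses, then the 1/0 script combined with `min_score`); it does not talk to Elasticsearch, and the helper names are hypothetical.

```python
from datetime import date

# The two sample documents, reduced to the fields the query reads.
DOCS = {
    "33952": {"default_fee": 12, "custom_dates": [("2023-11-01", 88), ("2023-11-02", 38)]},
    "33953": {"default_fee": 24, "custom_dates": [("2023-11-01", 12), ("2023-11-02", 1)]},
}

def expected_score(doc, gte, lte, total_stay):
    """Sum of the two should clauses: default price plus matching adjustments."""
    lo, hi = date.fromisoformat(gte), date.fromisoformat(lte)
    default_part = doc["default_fee"] * total_stay
    nested_part = sum(
        adjustment
        for day, adjustment in doc["custom_dates"]
        if lo <= date.fromisoformat(day) <= hi
    )
    return default_part + nested_part

def passes_min_score(score, min_price, max_price, min_score=1):
    """The outer script maps in-range prices to 1 and everything else to 0."""
    final_score = 1 if min_price <= score <= max_price else 0
    return final_score >= min_score

scores = {
    doc_id: expected_score(doc, "2023-10-31", "2023-11-02", 3)
    for doc_id, doc in DOCS.items()
}
matching_ids = [doc_id for doc_id, s in scores.items() if passes_min_score(s, 100, 200)]
```

Under these assumptions document 33952 scores 162 and document 33953 scores 85, so only 33952 falls inside the 100 to 200 range.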
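Does this work? Yes, to some extent. In Elasticsearch, scores are non-negative 32-bit floating-point numbers. So there is not much precision available, and if your adjustments are negative, things get even more complicated.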
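To get a feel for what 32-bit precision means for prices, the snippet below round-trips a few values through a single-precision float using Python's `struct` module. It only illustrates the number format; the sample values are made up and the helper name is hypothetical.

```python
import struct

def as_float32(value):
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# Small whole-number prices survive unchanged.
small = as_float32(162.0)

# Above 2**24 not every integer is representable any more.
large = as_float32(16777217.0)

# Fractional amounts lose precision much earlier.
cents = as_float32(1234567.89)
```

Here `small` stays 162.0, while `large` comes back as 16777216.0 and `cents` as 1234567.875, so prices with cents or very large totals can cross a range boundary by accident.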
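Would I do something like this in production? I would not. What I would do is store the special dates in the main document in some easily parsable format, so that I can access them during the query phase, and then parse them from the main document in both the script query and the script_fields. Yes, you need to parse them twice, but as I mentioned at the beginning of the answer, there is nothing we can do about that because these operations are executed in different phases. The simplest way is to store them as a multi-valued keyword field. Basically, you can do something like this: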
DELETE test
PUT test
{
"mappings": {
"properties": {
"default_fee": {
"type": "double"
},
"custom_dates": {
"type": "keyword"
}
}
}
}
PUT test/_doc/33952?refresh
{
"default_fee": 12,
"custom_dates": ["2023-11-01:100", "2023-11-02:150"],
"options": [
{
"id": 95,
"cost": 5,
"type": [
"Car"
]
}
]
}
PUT test/_doc/33953?refresh
{
"default_fee": 24,
"custom_dates": ["2023-11-01:12", "2023-11-02:1"],
"options": [
{
"id": 95,
"cost": 5,
"type": [
"Truck"
]
}
]
}
POST test/_search
{
"query": {
"script": {
"script": {
"source": """
// Parse the requested stay; the checkout day itself is not charged.
DateTimeFormatter formatter = DateTimeFormatter.ofPattern('yyyy-MM-dd');
def from = LocalDate.parse(params.checkin, formatter);
def to = LocalDate.parse(params.checkout, formatter);
def stay = java.time.temporal.ChronoUnit.DAYS.between(from, to);
// Start with no custom prices so a document without them pays only the default fee.
def custom_prices = [];
if (doc.containsKey('custom_dates')) {
  // Each value looks like 'yyyy-MM-dd:price'; keep the prices whose date falls within the stay.
  custom_prices = doc['custom_dates'].stream()
    .map(date_price -> {
      def date_price_parsed = date_price.splitOnToken(':');
      def date = LocalDate.parse(date_price_parsed[0], formatter);
      if (!date.isBefore(from) && !date.isAfter(to.minusDays(1))) {
        return Double.parseDouble(date_price_parsed[1]);
      } else {
        return -1;
      }
    })
    .filter(price -> {return price > 0;})
    .collect(Collectors.toList());
}
def custom_price = custom_prices.sum();
// Nights without a custom price are charged the default fee.
def default_price = stay == custom_prices.size() ? 0 : (stay - custom_prices.size()) * doc['default_fee'].value;
def calc_price = default_price + custom_price;
return calc_price >= params.min_price && calc_price <= params.max_price;
""",
"params": {
"checkin": "2023-10-31",
"checkout": "2023-11-02",
"min_price": 100,
"max_price": 200
}
}
}
},
"script_fields": {
"total": {
"script": {
"source": """
// Same calculation as in the script query, repeated because script fields run in the fetch phase.
DateTimeFormatter formatter = DateTimeFormatter.ofPattern('yyyy-MM-dd');
def from = LocalDate.parse(params.checkin, formatter);
def to = LocalDate.parse(params.checkout, formatter);
def stay = java.time.temporal.ChronoUnit.DAYS.between(from, to);
// Start with no custom prices so a document without them pays only the default fee.
def custom_prices = [];
if (doc.containsKey('custom_dates')) {
  // Each value looks like 'yyyy-MM-dd:price'; keep the prices whose date falls within the stay.
  custom_prices = doc['custom_dates'].stream()
    .map(date_price -> {
      def date_price_parsed = date_price.splitOnToken(':');
      def date = LocalDate.parse(date_price_parsed[0], formatter);
      if (!date.isBefore(from) && !date.isAfter(to.minusDays(1))) {
        return Double.parseDouble(date_price_parsed[1]);
      } else {
        return -1;
      }
    })
    .filter(price -> {return price > 0;})
    .collect(Collectors.toList());
}
def custom_price = custom_prices.sum();
// Nights without a custom price are charged the default fee.
def default_price = stay == custom_prices.size() ? 0 : (stay - custom_prices.size()) * doc['default_fee'].value;
def calc_price = default_price + custom_price;
return calc_price;
""",
"params": {
"checkin": "2023-10-31",
"checkout": "2023-11-02"
}
}
}
},
"_source": [
"*"
]
}
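For checking expected results outside the cluster, the same pricing rules can be written as a short Python sketch. It follows the logic of the scripts above (check-out day excluded, `date:price` tokens split on the colon), except that it needs no -1 sentinel, so a custom price of 0 would be kept here rather than filtered out. The function names are hypothetical.

```python
from datetime import date

def total_price(default_fee, custom_dates, checkin, checkout):
    """Total for a stay, given 'yyyy-MM-dd:price' tokens as stored in the keyword field."""
    start, end = date.fromisoformat(checkin), date.fromisoformat(checkout)
    nights = (end - start).days
    custom_prices = []
    for token in custom_dates:
        day, price = token.split(":")
        # Only nights from check-in up to, but not including, check-out count.
        if start <= date.fromisoformat(day) < end:
            custom_prices.append(float(price))
    return (nights - len(custom_prices)) * default_fee + sum(custom_prices)

def in_range(total, min_price, max_price):
    return min_price <= total <= max_price

total_33952 = total_price(12, ["2023-11-01:100", "2023-11-02:150"], "2023-10-31", "2023-11-02")
total_33953 = total_price(24, ["2023-11-01:12", "2023-11-02:1"], "2023-10-31", "2023-11-02")
```

For the two sample documents this gives 112 and 36, so with a range of 100 to 200 only document 33952 should be returned.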