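I want to output a flat data table (for CSV) in which the values of predictionModels.modelID become the column headers and each column holds the corresponding predictionModels.predictions.passengers value where predictionModels.predictions.daysAhead = 1. The number of models is variable but finite, so I'm happy to hard-code them into the query, but I can't simply UNNEST and output the result.
I'm reluctant to create an index: this is a large data store in production, and I'm a junior in the office who is really just playing around to get to know our data better. I'm worried about doing anything persistent, or anything that might affect resource availability elsewhere. I have been trying something like this: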
SELECT departureDateTime, lineID, weight,
CASE WHEN predictionModels.modelID = "model1" AND predictionModels.predictions.daysAhead=1 THEN predictionModels.predictions.passengers END AS Model1,
CASE WHEN predictionModels.modelID = "model2" AND predictionModels.predictions.daysAhead=1 THEN predictionModels.predictions.passengers END AS Model2
FROM bucket
Sample data:
{
'departureDateTime': '2022-12-23T00:10:00+00:00',
'lineID': 'f2b4d1d9',
'weight': '630',
'predictionModels': [
{
"modelID":"model1",
"predictions": [
{"daysAhead":1, "passengers":11},
{"daysAhead":2, "passengers":12},
{"daysAhead":3, "passengers":13},
{"daysAhead":4, "passengers":14}
]
},
{
"modelID":"model2",
"predictions": [
{"daysAhead":1, "passengers":21},
{"daysAhead":2, "passengers":22},
{"daysAhead":3, "passengers":23},
{"daysAhead":4, "passengers":24}
]
}
]
},
{
'departureDateTime': '2023-01-24T00:17:00+00:00',
'lineID': 'f2b4d1d9',
'weight': '520',
'predictionModels': [
{
"modelID":"model2",
"predictions": [
{"daysAhead":1, "passengers":210},
{"daysAhead":2, "passengers":220},
{"daysAhead":3, "passengers":230},
{"daysAhead":4, "passengers":240}
]
}
]
}
Desired output:
departureDateTime | lineID | weight | model1 | model2
---------------------------------------------------------------
2022-12-23T00:10:00+00:00 | f2b4d1d9 | 630 | 11 | 21
2023-01-24T00:17:00+00:00 | f2b4d1d9 | 520 | NaN | 210
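
To make the target shape concrete, here is a minimal Python sketch of the pivot I am after, applied to the sample documents above (trimmed to two predictions per model). This is not a query for the data store, only a restatement of the intended logic. The names `MODEL_IDS`, `flatten`, and `to_csv` are made up for illustration, and a model that is absent from a document is assumed to leave its column empty.

```python
import csv
import io

# Sample documents from the question, normalized to valid Python literals
# and trimmed to two predictions per model.
docs = [
    {
        "departureDateTime": "2022-12-23T00:10:00+00:00",
        "lineID": "f2b4d1d9",
        "weight": "630",
        "predictionModels": [
            {"modelID": "model1", "predictions": [
                {"daysAhead": 1, "passengers": 11},
                {"daysAhead": 2, "passengers": 12}]},
            {"modelID": "model2", "predictions": [
                {"daysAhead": 1, "passengers": 21},
                {"daysAhead": 2, "passengers": 22}]},
        ],
    },
    {
        "departureDateTime": "2023-01-24T00:17:00+00:00",
        "lineID": "f2b4d1d9",
        "weight": "520",
        "predictionModels": [
            {"modelID": "model2", "predictions": [
                {"daysAhead": 1, "passengers": 210},
                {"daysAhead": 2, "passengers": 220}]},
        ],
    },
]

# Hard-coded model IDs (illustrative name), one output column per model.
MODEL_IDS = ["model1", "model2"]
BASE_FIELDS = ["departureDateTime", "lineID", "weight"]


def flatten(doc, model_ids=MODEL_IDS, days_ahead=1):
    """Build one flat row per document; absent models stay None."""
    row = {field: doc.get(field) for field in BASE_FIELDS}
    row.update({model_id: None for model_id in model_ids})
    for model in doc.get("predictionModels", []):
        if model.get("modelID") not in model_ids:
            continue  # ignore models that are not hard-coded above
        for prediction in model.get("predictions", []):
            if prediction.get("daysAhead") == days_ahead:
                row[model["modelID"]] = prediction.get("passengers")
                break  # take the first matching prediction only
    return row


def to_csv(rows, model_ids=MODEL_IDS):
    """Serialize the flat rows as CSV text; None is written as an empty cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=BASE_FIELDS + list(model_ids), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [flatten(doc) for doc in docs]
print(to_csv(rows))
```

The second row has no `model1` entry, so its `model1` cell comes out empty. That is the behavior I would like the query to reproduce, ideally without any persistent change such as a new index.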