I can't figure out how to extract the relevant information from a SQL column of this type:
array<
struct<
day_of_week:string,
start:bigint,
duration:bigint,
enabled:boolean,
created_at:timestamp,
deleted_at:timestamp
>
>
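This column holds the daily opening hours of the restaurants in our database. Some restaurants have changed their daily hours of operation, so I don't need some of the entries in the SQL table. All I need is the current opening hours for every restaurant.
Here is an example of the column I'm trying to extract the information from: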
[
{
"day_of_week": "4",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-02-23T10:47:15.033+0000",
"deleted_at": "2018-10-22T18:27:40.403+0000"
},
{
"day_of_week": "7",
"start": 64800000,
"duration": 359,
"enabled": true,
"created_at": "2018-10-22T18:29:11.030+0000",
"deleted_at": null
},
{
"day_of_week": "5",
"start": 64800000,
"duration": 359,
"enabled": true,
"created_at": "2018-10-22T18:29:11.030+0000",
"deleted_at": null
},
{
"day_of_week": "6",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:40.397+0000",
"deleted_at": "2018-10-22T18:27:42.074+0000"
},
{
"day_of_week": "7",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:40.397+0000",
"deleted_at": "2018-10-22T18:27:42.074+0000"
},
{
"day_of_week": "1",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:42.069+0000",
"deleted_at": "2018-10-22T18:29:11.035+0000"
},
{
"day_of_week": "6",
"start": 64800000,
"duration": 359,
"enabled": true,
"created_at": "2018-10-22T18:29:11.030+0000",
"deleted_at": null
},
{
"day_of_week": "7",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-10-22T18:27:42.069+0000",
"deleted_at": "2018-10-22T18:29:11.035+0000"
},
{
"day_of_week": "2",
"start": 64800000,
"duration": 359,
"enabled": false,
"created_at": "2018-02-23T10:47:15.033+0000",
"deleted_at": "2018-10-22T18:27:40.403+0000"
},
I'm not interested in this entry, because it was deleted on 2018-10-22:
[{"day_of_week":"4","start":64800000,"duration":359,"enabled":false,
"created_at":"2018-02-23T10:47:15.033+0000","deleted_at":"2018-10-22T18:27:40.403+0000"}
But I am interested in every field of this entry, because it shows the current hours for day_of_week: 7:
"day_of_week":"7","start":64800000,"duration":359,"enabled":true,
"created_at":"2018-10-22T18:29:11.030+0000","deleted_at":null
I tried this to get all the elements of the column, but it only returns the contents of the first cell and nothing more:
LATERAL VIEW explode(shifts.`day_of_week`) exploded_table as day_of_week
LATERAL VIEW explode(shifts.`start`) exploded_table as start
LATERAL VIEW explode(shifts.`enabled`) exploded_table as enabled
LATERAL VIEW explode(shifts.`duration`) exploded_table as duration
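Can someone help me with this?
Also, I assume "start": 64800000 refers to the opening time and "duration": 359 to how long the restaurant stays open. But I can't interpret these numbers either. I don't know whether "start": 64800000 means 7 AM, 8 AM, or 9 AM, or whether "duration": 359 means 7 hours or 9 hours.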
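One way to sanity-check a guess about the units is to do the arithmetic. The sketch below assumes that start is milliseconds since midnight and duration is minutes; nothing in the schema confirms either unit, and the helper name describe_shift is made up for illustration.

```python
from datetime import timedelta

def describe_shift(start, duration):
    """Return (opening, closing) as strings.

    ASSUMPTION: start is milliseconds since midnight and
    duration is minutes. Neither unit is confirmed by the schema.
    """
    opens = timedelta(milliseconds=start)
    closes = opens + timedelta(minutes=duration)
    return str(opens), str(closes)

# Values taken from the sample column above.
print(describe_shift(64800000, 359))  # ('18:00:00', '23:59:00')
```

Under those assumptions the shift would run from 18:00 to 23:59, which at least looks like a plausible evening opening; other unit combinations could be tried the same way.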
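Sorry for the long post, but I'm new to SQL, and this is the only real resource I have for things I can't figure out on my own.
Thanks in advance for any help.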
TLDR:
For a DataFrame df with the schema:
key:integer
data:array
    element:struct
        day_of_week:string
        start:decimal(38,0)
        duration:decimal(38,0)
        enabled:boolean
        created_at:string
        deleted_at:string
registered as the temporary table test, the array can be exploded like this:
select key, a.ed.day_of_week,
a.ed.start, a.ed.duration,
a.ed.enabled, a.ed.created_at, a.ed.deleted_at
from (select key, explode(data) as ed from global_temp.test) a
where a.ed.deleted_at is null
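To make the explode-then-filter logic of the query easier to follow, here is a plain Python sketch of the same idea without Spark. The two entries are copied from the sample column in the question; the key value 1 and the function name current_shifts are hypothetical, chosen only for illustration.

```python
import json

# Two entries from the sample column; key=1 is a made-up restaurant id.
rows = [
    {
        "key": 1,
        "data": json.loads("""
        [
          {"day_of_week": "4", "start": 64800000, "duration": 359,
           "enabled": false,
           "created_at": "2018-02-23T10:47:15.033+0000",
           "deleted_at": "2018-10-22T18:27:40.403+0000"},
          {"day_of_week": "7", "start": 64800000, "duration": 359,
           "enabled": true,
           "created_at": "2018-10-22T18:29:11.030+0000",
           "deleted_at": null}
        ]
        """),
    }
]

def current_shifts(rows):
    """Flatten each row's array (one output row per element, like explode)
    and keep only the elements whose deleted_at is null."""
    result = []
    for row in rows:
        for shift in row["data"]:
            if shift["deleted_at"] is None:
                result.append({"key": row["key"], **shift})
    return result

for shift in current_shifts(rows):
    print(shift["key"], shift["day_of_week"], shift["start"],
          shift["duration"], shift["enabled"])
```

With this sample, only the day_of_week 7 entry survives the filter, which matches the entry the question says it wants to keep.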