I have a common table expression with a window function and keep getting an error message:
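Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 82:6 Invalid column reference 'gcr_amt' in definition of CTE pro_orders [select o.shopper_id as pro_shopper_id, date_format(o.order_date, 'YYYYMM') as ym_order, sum(o.gcr_amt) as total_gcr, sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt end) as new_gcr, sum(o.gcr_amt) over (partition by o.shopper_id rows between [...] preceding [...]) as 12months_direct_gcr from dp_enterprise.uds_order o inner join combined_shopper_level_data cs on cs.pro_shopper_id = o.shopper_id and cs.year_month = date_format(o.order_date, 'YYYYMM') where o.exclude_reason_desc is Null group by o.shopper_id, o.order_date] used as po at Line 83:5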
My CTE looks like this:
pro_orders as (
    select o.shopper_id as pro_shopper_id,
           date_format(o.order_date, 'YYYYMM') as ym_order,
           sum(o.gcr_amt) as total_gcr,
           sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt end) as new_gcr,
           sum(o.gcr_amt) over (partition by o.shopper_id, cs.year_month order by cs.year_month desc rows between 12 preceding and 0 following) as 12months_direct_gcr
    from dp_enterprise.uds_order o
    right join combined_shopper_level_data cs on cs.pro_shopper_id = o.shopper_id and cs.year_month = date_format(o.order_date, 'YYYYMM')
    group by o.shopper_id, o.order_date
),
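I don't use window functions often, so my syntax may be off. In plain English, what I'm trying to do is get a 12-month running total of the metric "gcr".
So for a row with year_month 201901 and shopper_id 123abc, I want the sum of gcr over the previous 11 months plus the current row, 12 months in total. I'm not sure whether my window function is set up correctly.
The year_month column referenced is in YYYYMM format, e.g. 201901.
Given my goal, is my window function set up correctly?
How can I get past this error message?
Edit: I still get this error message with the following CTE: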
pro_orders as (
    select o.shopper_id as pro_shopper_id,
           cs.year_month,
           sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month then o.gcr_amt else 0 end) as total_gcr,
           sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month and o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr,
           sum(sum(o.gcr_amt)) over (partition by o.shopper_id
                                     order by cs.year_month desc
                                     rows between 12 preceding and 0 following)
               as 12months_direct_gcr
    from combined_shopper_level_data cs
    left join dp_enterprise.uds_order o on o.shopper_id = cs.pro_shopper_id
    where o.exclude_reason_desc is Null
    group by o.shopper_id, cs.year_month
),
This results in a similar error message:
Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 83:10 Invalid column reference 'gcr_amt' in definition of CTE pro_orders [select o.shopper_id as pro_shopper_id, cs.year_month, sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month then o.gcr_amt else 0 end) as total_gcr, sum(case when date_format(o.order_date, 'YYYYMM') = cs.year_month and o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr, sum(sum(o.gcr_amt)) over (partition by o.shopper_id order by cs.year_month desc rows between 12 preceding and 0 following) as 12months_direct_gcr from combined_shopper_level_data cs left join dp_enterprise.uds_order o on o.shopper_id = cs.pro_shopper_id where o.exclude_reason_desc is Null group by o.shopper_id, cs.year_month] used as po at Line 87:5
You have an aggregation query, so the window function looks a bit funny. The basic idea is this:
sum(sum(o.gcr_amt)) over (partition by o.shopper_id, cs.year_month
                          order by cs.year_month desc
                          rows between 12 preceding and 0 following
                         ) as 12months_direct_gcr
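That still won't work, though. First, you have the same value, cs.year_month, in both the order by and the partition by, so each partition holds a single month and the ordering has nothing to sort. Second, cs.year_month is not in the group by.
Assuming that you have a value for every month, you can use: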
sum(sum(o.gcr_amt)) over (partition by o.shopper_id
                          order by cs.year_month desc
                          rows between 12 preceding and 0 following
                         ) as 12months_direct_gcr
and use cs.year_month in the group by (you may need to adjust other parts of the query).
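To illustrate why the nested sum(sum(...)) is needed, here is a minimal sketch on made-up data. The orders_demo CTE, its rows, and the running_gcr column are hypothetical and only there to keep the example self-contained; the running-total frame is chosen to keep the arithmetic easy to follow. Whether a particular Hive version accepts a window function directly in an aggregation query is not something this sketch settles.

```sql
-- Hypothetical sample data: one shopper with orders in three months.
with orders_demo as (
    select 'a1' as shopper_id, '201811' as year_month, 10 as gcr_amt
    union all select 'a1', '201811', 5
    union all select 'a1', '201812', 20
    union all select 'a1', '201901', 30
)
select shopper_id,
       year_month,
       -- Inner aggregate: one value per (shopper_id, year_month) group.
       sum(gcr_amt) as total_gcr,
       -- The window runs over the grouped rows, so it may only reference
       -- grouping columns and aggregates, hence sum(sum(...)).
       sum(sum(gcr_amt)) over (partition by shopper_id
                               order by year_month
                               rows between unbounded preceding and current row
                              ) as running_gcr
from orders_demo
group by shopper_id, year_month;
```

On this data total_gcr comes out as 15, 20, 30 and running_gcr as 15, 35, 65 for 201811, 201812, 201901. Writing sum(gcr_amt) over (...) instead would reference a column that is neither grouped nor aggregated, which is the kind of reference the "Invalid column reference 'gcr_amt'" message complains about.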
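For readability, I would also recommend using left join rather than right join. For me (and most people), it is easier to think "keep all rows in the first table I just read" than "keep all rows in some table at the end of the from clause".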
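As a sketch of that rewrite using the tables from the question, the two queries below should keep the same rows; only the position of the preserved table changes. The select list is illustrative. Note that it reads the shopper id from cs, because the columns of o are NULL on rows that have no matching order.

```sql
-- right join: the table whose rows are all kept appears last.
select cs.pro_shopper_id, cs.year_month, o.gcr_amt
from dp_enterprise.uds_order o
right join combined_shopper_level_data cs
    on cs.pro_shopper_id = o.shopper_id;

-- left join: the table whose rows are all kept appears first.
select cs.pro_shopper_id, cs.year_month, o.gcr_amt
from combined_shopper_level_data cs
left join dp_enterprise.uds_order o
    on o.shopper_id = cs.pro_shopper_id;
```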
Edit:
I think the full query is:
with pro_orders as (
    -- Take the shopper id from cs: with the left join, o.shopper_id is NULL
    -- for months that have no matching order.
    select cs.pro_shopper_id,
           cs.year_month,
           sum(coalesce(o.gcr_amt, 0)) as total_gcr,
           sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr,
           sum(sum(o.gcr_amt)) over (partition by cs.pro_shopper_id
                                     order by cs.year_month desc
                                     rows between 12 preceding and 0 following
                                    ) as 12months_direct_gcr
    from combined_shopper_level_data cs left join
         dp_enterprise.uds_order o
         on o.shopper_id = cs.pro_shopper_id and
            -- 'yyyyMM' rather than 'YYYYMM': in Java date patterns 'Y' is the
            -- week-based year, which differs from the calendar year around New Year.
            date_format(o.order_date, 'yyyyMM') = cs.year_month and
            o.exclude_reason_desc is Null
    group by cs.pro_shopper_id, cs.year_month
),
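Hive may have a limitation on using window functions in an aggregation query (which would surprise me, because the two are processed separately). I cannot find a specific reference. If so, just use a subquery: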
with pro_orders as (
    select pro_shopper_id, year_month, total_gcr, new_gcr,
           -- The subquery has already aggregated per month, so a plain
           -- windowed sum over total_gcr is enough here.
           sum(total_gcr) over (partition by pro_shopper_id
                                order by year_month desc
                                rows between 12 preceding and 0 following
                               ) as 12months_direct_gcr
    from (select cs.pro_shopper_id,  -- from cs: o.shopper_id is NULL when no order matches
                 cs.year_month,
                 sum(coalesce(o.gcr_amt, 0)) as total_gcr,
                 sum(case when o.product_pnl_new_renewal_name = 'New Purchase' then o.gcr_amt else 0 end) as new_gcr
          from combined_shopper_level_data cs left join
               dp_enterprise.uds_order o
               on o.shopper_id = cs.pro_shopper_id and
                  -- 'yyyyMM': in Java date patterns 'Y' is the week-based year
                  date_format(o.order_date, 'yyyyMM') = cs.year_month and
                  o.exclude_reason_desc is Null
          group by cs.pro_shopper_id, cs.year_month
         ) ps
),
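On whether the frame itself matches the stated goal, here is a sketch under two assumptions: the goal is the current month plus the previous 11, and each shopper has exactly one row per month. With the rows ordered ascending, "rows between 11 preceding and current row" covers 12 rows. A frame ordered descending with "12 preceding and 0 following" covers 13 rows and reaches toward later months instead. The monthly_demo CTE and its values are hypothetical, and the example uses a 3-month frame (2 preceding) so the arithmetic is easy to check; replace 2 with 11 for 12 months.

```sql
-- Hypothetical pre-aggregated data: one row per shopper and month.
with monthly_demo as (
    select 'a1' as pro_shopper_id, '201810' as year_month, 10 as total_gcr
    union all select 'a1', '201811', 20
    union all select 'a1', '201812', 30
    union all select 'a1', '201901', 40
)
select pro_shopper_id,
       year_month,
       total_gcr,
       -- Ascending order: "preceding" rows are earlier months.
       -- 2 preceding + current row = 3 rows; use 11 preceding for 12 months.
       sum(total_gcr) over (partition by pro_shopper_id
                            order by year_month
                            rows between 2 preceding and current row
                           ) as trailing_3m_gcr
from monthly_demo;
```

On this data trailing_3m_gcr is 10, 30, 60, 90 for 201810 through 201901. A rows frame counts rows rather than calendar months, so a shopper with a missing month would silently get a window that stretches further back; that is why the assumption of one row per month matters.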