Round-robin selection with a record limit


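I have to build a Hive SQL query for the following requirement.

I have a customer table. I need to divide the table's total records by 6 (for example, if the table contains 600 records, each month gets 100 records, for up to 6 months), and each month has a bracket that limits how many customers are targeted. If the bracket limit is 5, I need to select 5 unique email IDs from 5 unique accounts. If it is 10, then 10 unique email IDs from 10 unique accounts.

Note: I use a mod operation to distribute the records across the 6 months.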
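As a rough illustration of that mod step (a sketch in Python rather than Hive; the helper name `month_bucket` is hypothetical): a 1-based sequence number is wrapped into buckets 1 to 6, with a remainder of 0 mapped to 6, which appears to be what the `case ... mod(...)` expression in the attempted query further down is aiming for.

```python
def month_bucket(seq: int, months: int = 6) -> int:
    """Map a 1-based sequence number to a bucket in 1..months.

    Equivalent to: case when mod(seq, months) = 0 then months
                   else mod(seq, months) end
    """
    return (seq - 1) % months + 1


if __name__ == "__main__":
    # The 7th record of an account wraps around to bucket 1 again.
    print([month_bucket(n) for n in range(1, 8)])  # [1, 2, 3, 4, 5, 6, 1]
```

This matches the sample data, where the seventh email of acc2 lands in bucket 1.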

account  email        mod_6
acc1     email@acc1   1
acc2     email1@acc2  1
acc2     email2@acc2  2
acc2     email3@acc2  3
acc2     email4@acc2  4
acc2     email5@acc2  5
acc2     email6@acc2  6
acc2     email7@acc2  1
acc3     email1@acc3  1
acc3     email2@acc3  2
acc3     email3@acc3  3
acc4     email@acc4   1
acc5     email1@acc5  1
acc5     email2@acc5  2

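Expected output for a bracket of 4 (acc5 does not need to appear, because the record count has already reached the bracket limit of 4):

account  email        mod_6
acc1     email@acc1   1
acc2     email1@acc2  1
acc3     email1@acc3  1
acc4     email@acc4   1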

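If the bracket is 8 (I must select all unique accounts first, and then pick the remaining records in sequence order until the bracket limit is reached):

Expected output:

account  email        mod_6
acc1     email@acc1   1
acc2     email1@acc2  1
acc3     email1@acc3  1
acc4     email@acc4   1
acc5     email1@acc5  1
acc2     email7@acc2  1
acc2     email2@acc2  2
acc3     email2@acc3  2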

If the bracket is 10:

account  email        mod_6
acc1     email@acc1   1
acc2     email1@acc2  1
acc3     email1@acc3  1
acc4     email@acc4   1
acc5     email1@acc5  1
acc2     email7@acc2  1
acc2     email2@acc2  2
acc3     email2@acc3  2
acc5     email2@acc5  2
acc2     email3@acc2  3

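I tried the following query, but it fetches all the mod 1 records first. I don't know how to first get the unique account records with mod_seq_value 1, and then the remaining records starting from mod seq 1.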

-- Attempted query: ranks by mod bucket first, so every bucket-1 row
-- sorts ahead of any bucket-2 row.
select * from (
  select *, row_number() over (order by mod_num_seq, acc_count) as rnk
  from (
    select account, email,
           count(*) over (partition by account) as acc_count,
           case
             when mod(row_number() over (partition by account order by email), 6) = 0 then 6
             else mod(row_number() over (partition by account order by email), 6)
           end as mod_num_seq
    from customer
  ) a
) b
where rnk <= {:bracket}

Tags: oracle sequence hql analytics round-robin

1 answer

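I'm not sure why rows 6 and 7 in your expected output are

email7@acc2, email2@acc2
rather than
email7@acc2, email2@acc3

The query below (Oracle syntax) assigns each row a round-robin pick-up order within its account: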

with customer (account, email, x) as
(
select 'acc1','email@acc1', 1 from dual
union all select 'acc2','email1@acc2',  1 from dual
union all select 'acc2','email2@acc2',  2 from dual
union all select 'acc2','email3@acc2',  3 from dual
union all select 'acc2','email4@acc2',  4 from dual
union all select 'acc2','email5@acc2',  5 from dual
union all select 'acc2','email6@acc2',  6 from dual
union all select 'acc2','email7@acc2',  1 from dual
union all select 'acc3','email1@acc3',  1 from dual
union all select 'acc3','email2@acc3',  2 from dual
union all select 'acc3','email3@acc3',  3 from dual
union all select 'acc4','email@acc4',   1 from dual
union all select 'acc5','email1@acc5',  1 from dual
union all select 'acc5','email2@acc5',  2 from dual
)
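, t as
(
  -- rn: 1-based position within the account, wrapped into buckets 1..6
  select c.*, nvl(nullif(mod(row_number() over (partition by account order by email),6),0),6) rn
  from customer c
)
-- pick_up_order: the round in which each row is taken from its account
select t.*, row_number() over (partition by account order by rn, email) pick_up_order
from t
order by pick_up_order, account;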

Result:

ACCOUNT EMAIL                X         RN PICK_UP_ORDER
------- ----------- ---------- ---------- -------------
acc1    email@acc1           1          1             1
acc2    email1@acc2          1          1             1
acc3    email1@acc3          1          1             1
acc4    email@acc4           1          1             1
acc5    email1@acc5          1          1             1
acc2    email7@acc2          1          1             2
acc3    email2@acc3          2          2             2
acc5    email2@acc5          2          2             2
acc2    email2@acc2          2          2             3
acc3    email3@acc3          3          3             3
acc2    email3@acc2          3          3             4
acc2    email4@acc2          4          4             5
acc2    email5@acc2          5          5             6
acc2    email6@acc2          6          6             7

14 rows selected.
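
To apply the bracket itself, one option is to number the rows of this result by `pick_up_order, account` and keep only the first N. The sketch below is an illustration under stated assumptions, not part of the original answer: it runs the same idea on SQLite through Python's `sqlite3` module (window functions need SQLite 3.25 or newer), it rewrites the `nvl(nullif(mod(...)))` expression as `(seq - 1) % 6 + 1`, and the names `make_conn`, `pick`, `ROUND_ROBIN`, and `UNIQUE_FIRST` are hypothetical. `ROUND_ROBIN` follows the ordering of the query above. `UNIQUE_FIRST` is only a guess at the ordering the question's expected output implies: the first row of every account, then the remaining rows by bucket.

```python
import sqlite3

# Sample rows from the question: (account, email).
ROWS = [
    ("acc1", "email@acc1"),
    ("acc2", "email1@acc2"), ("acc2", "email2@acc2"), ("acc2", "email3@acc2"),
    ("acc2", "email4@acc2"), ("acc2", "email5@acc2"), ("acc2", "email6@acc2"),
    ("acc2", "email7@acc2"),
    ("acc3", "email1@acc3"), ("acc3", "email2@acc3"), ("acc3", "email3@acc3"),
    ("acc4", "email@acc4"),
    ("acc5", "email1@acc5"), ("acc5", "email2@acc5"),
]

# Strict round-robin: one row per account per round (the answer's ordering).
ROUND_ROBIN = "pick_up_order, account"
# Assumed reading of the question: first row of each account, then by bucket.
UNIQUE_FIRST = "case when pick_up_order = 1 then 0 else 1 end, rn, account, email"

QUERY = """
with s as (
    select account, email,
           row_number() over (partition by account order by email) as seq
    from customer
),
t as (
    -- wrap the per-account sequence into buckets 1..6
    select account, email, (seq - 1) % 6 + 1 as rn
    from s
),
p as (
    select account, email, rn,
           row_number() over (partition by account order by rn, email) as pick_up_order
    from t
),
r as (
    -- global rank in the chosen order; the bracket keeps the first N
    select account, email,
           row_number() over (order by {order_by}) as rnk
    from p
)
select account, email from r where rnk <= ? order by rnk
"""


def make_conn():
    """Create an in-memory customer table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table customer (account text, email text)")
    conn.executemany("insert into customer values (?, ?)", ROWS)
    return conn


def pick(conn, bracket, order_by=ROUND_ROBIN):
    """Return the first `bracket` (account, email) rows in the given order."""
    return conn.execute(QUERY.format(order_by=order_by), (bracket,)).fetchall()


if __name__ == "__main__":
    conn = make_conn()
    for row in pick(conn, 8):
        print(row)
```

With a bracket of 8, `ROUND_ROBIN` ends with email7@acc2, email2@acc3, email2@acc5, while `UNIQUE_FIRST` ends with email7@acc2, email2@acc2, email2@acc3, which is the order shown in the question. In Hive or Oracle the same shape should carry over by adding an outer `row_number() over (order by ...)` and filtering on it, though that has not been verified here.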