How do I compute an average per partition, where each partition contains at most 5 time-related members?

Question (votes: 0, answers: 2)

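My goal is to select the average of at most 5 records from another table, using only records that satisfy the left join criteria. Suppose we have table one (left) with these records:

RECNUM | ID  | DATE       | JOB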
1      | cat | 2019.01.01 | meow
2      | dog | 2019.01.01 | bark

And we have table two (right) with these records:

RECNUM | ID  | Action_ID  | DATE       | REWARD
1      | cat | 1          | 2019.01.02 | 20
2      | cat | 99         | 2018.12.30 | 1
3      | cat | 23         | 2019.12.28 | 20       
4      | cat | 54         | 2018.01.01 | 20
5      | cat | 32         | 2018.01.02 | 20
6      | cat | 21         | 2018.01.03 | 20
7      | cat | 43         | 2018.12.28 | 1
8      | cat | 65         | 2018.12.29 | 1
9      | cat | 87         | 2018.09.12 | 1
10     | cat | 98         | 2018.10.11 | 1 
11     | dog | 56         | 2018.09.01 | 99 
12     | dog | 42         | 2019.09.02 | 99 

The result should be:

ID  | AVG(Reward_from_latest_5_jobs)
cat | 1

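The criteria are as follows: for each job in the left table, find the 5 most recent unique Action_IDs in the right table that have the same ID and a date earlier than the job's date, and compute the average of their rewards. In other words, the job is hard and we do not know what reward to give for it, so we try to compute the average of the last five rewards received. If fewer than 5 are found, return nothing (or null); if more are found, discard the oldest.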
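The rule above can be sketched outside SQL to check the expected result against the sample rows. This is only an illustration, not the requested solution: the function name `latest_five_average` is made up, and comparing the dates as strings is an assumption that holds only because they are all written as YYYY.MM.DD.

```python
# Sample rows copied from the two tables above.
left = [("cat", "2019.01.01"), ("dog", "2019.01.01")]  # (ID, DATE)
right = [  # (ID, Action_ID, DATE, REWARD)
    ("cat", 1, "2019.01.02", 20), ("cat", 99, "2018.12.30", 1),
    ("cat", 23, "2019.12.28", 20), ("cat", 54, "2018.01.01", 20),
    ("cat", 32, "2018.01.02", 20), ("cat", 21, "2018.01.03", 20),
    ("cat", 43, "2018.12.28", 1), ("cat", 65, "2018.12.29", 1),
    ("cat", 87, "2018.09.12", 1), ("cat", 98, "2018.10.11", 1),
    ("dog", 56, "2018.09.01", 99), ("dog", 42, "2019.09.02", 99),
]


def latest_five_average(job_id, job_date, actions, n=5):
    """Average REWARD of the n newest actions older than the job, else None."""
    # Keep only actions for the same ID that are strictly older than the job.
    older = [row for row in actions if row[0] == job_id and row[2] < job_date]
    # Newest first, then cut to the n most recent.
    older.sort(key=lambda row: row[2], reverse=True)
    newest = older[:n]
    if len(newest) < n:
        return None  # fewer than n older actions: no result for this ID
    return sum(row[3] for row in newest) / n


result = {job_id: latest_five_average(job_id, job_date, right)
          for job_id, job_date in left}
print(result)
```

For the sample data, `cat` gets an average of 1 (the five newest older rows all have REWARD 1), and `dog` gets no value because it has only one older action.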

The approach I had in mind looks like this:

         SELECT a."ID", COUNT(b."Action_ID"), AVG(b."REWARD")  
         FROM 
             ( 
                SELECT "ID", "DATE"
                 FROM :left_table
             ) a  

              LEFT JOIN

             ( 
                SELECT "ID", "Action_ID", "DATE", "REWARD"
                 FROM :right_table
             ) b 

             ON(
                    a."ID" = b."ID" 
               )    
         WHERE a."DATE" > b."DATE" 
         GROUP BY a."ID"
         HAVING COUNT(b."Action_ID") >= 5;

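However, this computes the average over all qualifying Action_IDs, not just the five most recent ones. Can you tell me how to get the expected result? I can use sub-tables, and it does not have to be done in a single SQL statement. Procedures are not allowed for this use case. Any input is highly appreciated.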
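To make the problem concrete, the sketch below loads the sample rows into an in-memory SQLite database and runs a simplified form of the attempted query. SQLite is an assumption made only so the example is runnable (the question targets SAP HANA), and the lower-case table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE left_table  (id TEXT, date TEXT, job TEXT);
CREATE TABLE right_table (id TEXT, action_id INTEGER, date TEXT, reward INTEGER);
INSERT INTO left_table VALUES
  ('cat', '2019.01.01', 'meow'), ('dog', '2019.01.01', 'bark');
INSERT INTO right_table VALUES
  ('cat',  1, '2019.01.02', 20), ('cat', 99, '2018.12.30',  1),
  ('cat', 23, '2019.12.28', 20), ('cat', 54, '2018.01.01', 20),
  ('cat', 32, '2018.01.02', 20), ('cat', 21, '2018.01.03', 20),
  ('cat', 43, '2018.12.28',  1), ('cat', 65, '2018.12.29',  1),
  ('cat', 87, '2018.09.12',  1), ('cat', 98, '2018.10.11',  1),
  ('dog', 56, '2018.09.01', 99), ('dog', 42, '2019.09.02', 99);
""")

# Same logic as the attempt: join on ID, keep older actions, average them all.
naive_rows = conn.execute("""
    SELECT a.id, COUNT(b.action_id), AVG(b.reward)
    FROM left_table a
    LEFT JOIN right_table b ON a.id = b.id
    WHERE a.date > b.date
    GROUP BY a.id
    HAVING COUNT(b.action_id) >= 5
""").fetchall()
print(naive_rows)
```

On the sample data this returns one row for `cat` with a count of 8 and an average of 8.125, because the three old rows with REWARD 20 are averaged together with the five most recent rows with REWARD 1.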

sql select sql-scripts data-partitioning hana-sql-script
2 Answers
0
votes

You can use window functions and then aggregate:

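select 
    id,
    avg(reward) avg_reward
from (
    select 
        t1.id, 
        t2.reward, 
        -- number of older actions found for this id
        count(*) over(partition by t1.id) cnt,
        -- 1 = most recent older action; note that rank() gives tied dates the same
        -- number, so more than five rows can pass the filter (row_number() would not)
        rank() over(partition by t1.id order by t2.date desc) rn
    from lefttable t1
    -- keep only actions that are strictly older than the job
    inner join righttable t2 on t1.id = t2.id and t2.date < t1.date
) t
where cnt >= 5 and rn <= 5
group by id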

0
votes

Use window functions to get the top five:

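select id, avg(reward)
from (select r.*,
             row_number() over (partition by l.id order by r.date desc) as seqnum
      from table1 l join
           table2 r
           on l.id = r.id and l.date > r.date
     ) r
where seqnum <= 5      -- keep only the five most recent older rows per id
group by id
having count(*) >= 5;  -- after the filter, true only when five rows exist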
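As a check of the window-function idea, the sketch below numbers the older rows per ID from newest to oldest, keeps rows 1 to 5, and returns an average only when all five exist. It runs on in-memory SQLite, which is an assumption made for the sake of a runnable example (window functions need SQLite 3.25 or later, and HANA syntax may differ in details); the table and column names are illustrative.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE table1 (id TEXT, date TEXT, job TEXT);
CREATE TABLE table2 (id TEXT, action_id INTEGER, date TEXT, reward INTEGER);
INSERT INTO table1 VALUES
  ('cat', '2019.01.01', 'meow'), ('dog', '2019.01.01', 'bark');
INSERT INTO table2 VALUES
  ('cat',  1, '2019.01.02', 20), ('cat', 99, '2018.12.30',  1),
  ('cat', 23, '2019.12.28', 20), ('cat', 54, '2018.01.01', 20),
  ('cat', 32, '2018.01.02', 20), ('cat', 21, '2018.01.03', 20),
  ('cat', 43, '2018.12.28',  1), ('cat', 65, '2018.12.29',  1),
  ('cat', 87, '2018.09.12',  1), ('cat', 98, '2018.10.11',  1),
  ('dog', 56, '2018.09.01', 99), ('dog', 42, '2019.09.02', 99);
""")

top5_rows = db.execute("""
    SELECT id, AVG(reward) AS avg_reward
    FROM (
        SELECT r.id, r.reward,
               -- 1 = newest action that is still older than the job
               ROW_NUMBER() OVER (PARTITION BY l.id ORDER BY r.date DESC) AS seqnum
        FROM table1 l
        JOIN table2 r ON l.id = r.id AND r.date < l.date
    ) ranked
    WHERE seqnum <= 5        -- discard everything older than the five newest
    GROUP BY id
    HAVING COUNT(*) = 5      -- IDs with fewer than five older actions drop out
""").fetchall()
print(top5_rows)
```

On the sample data only `cat` is returned, with an average of 1. If the left table can hold several jobs for the same ID, the partition would presumably need to include the job's own key (for example RECNUM) so that each job gets its own set of five rows.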