Cumulative distinct count

Question  Votes: 0  Answers: 7

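I am writing a query to get the cumulative distinct count of uids per day.

Example: suppose 2 uids (100, 200) appear on 2016-11-01, and both appear again the next day together with a new uid 300, so 2016-11-02 has (100, 200, 300). At that point I want the stored cumulative count to be 3 rather than 5, because uids 100 and 200 already appeared on the previous day.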

Input table:

date            uid
2016-11-01      100
2016-11-01      200
2016-11-01      300
2016-11-01      400
2016-11-02      100
2016-11-02      200
2016-11-03      300
2016-11-03      400
2016-11-03      500
2016-11-03      600
2016-11-04      700

Expected query result:

date            daily_cumulative_count
2016-11-01              4   
2016-11-02              4
2016-11-03              6
2016-11-04              7

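So far I can get a running total of each day's distinct count, but it counts uids again even when they already appeared on an earlier day.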

SELECT 
  date, 
  SUM(count) OVER (
    ORDER BY date ASC 
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  )
FROM (
  SELECT 
    date, 
    COUNT(DISTINCT uid) AS count
  FROM sample_table
  GROUP BY 1
)
ORDER BY date DESC;

Any help would be greatly appreciated.

sql presto
7 Answers
14
votes

The simplest approach:

SELECT *, count(*) over (order by fst_date ) cum_uids
  FROM (
SELECT uid, min(date) fst_date FROM t GROUP BY uid
 ) t

Or something along those lines.


12
votes
WITH firstseen AS (
  SELECT uid, MIN(date) date
  FROM sample_table
  GROUP BY 1
)
SELECT DISTINCT date, COUNT(uid) OVER (ORDER BY date) daily_cumulative_count 
FROM firstseen
ORDER BY 1

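SELECT DISTINCT is needed because each (date, COUNT(uid)) pair would otherwise be repeated once per uid first seen on that date.

Explanation: for each date dt, the window counts the uids from the earliest date up to and including dt, because we specified ORDER BY date and the frame then defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.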
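
One thing to watch with the first-seen approach: a date on which no uid appears for the first time (2016-11-02 in the sample data) has no row in firstseen, so it would be missing from the result. Below is a minimal sketch of one way to keep every date, assuming the sample_table(date, uid) layout from the question. The CTE names first_seen, new_per_day and all_dates are made up for illustration, and the query has not been run against a real Presto cluster.

```sql
WITH first_seen AS (
  -- one row per uid: the first date on which it appears
  SELECT uid, MIN(date) AS first_date
  FROM sample_table
  GROUP BY uid
),
new_per_day AS (
  -- how many uids appear for the first time on each date
  SELECT first_date, COUNT(*) AS new_uids
  FROM first_seen
  GROUP BY first_date
),
all_dates AS (
  -- every date present in the table, including dates with no new uid
  SELECT DISTINCT date
  FROM sample_table
)
SELECT
  d.date,
  -- running total of new uids; dates without new uids contribute 0
  SUM(COALESCE(n.new_uids, 0)) OVER (ORDER BY d.date) AS daily_cumulative_count
FROM all_dates d
LEFT JOIN new_per_day n ON n.first_date = d.date
ORDER BY d.date;
```

On the sample data this should return the four rows of the expected result, including 4 for 2016-11-02. Because the counts are aggregated per date before the window is applied, no SELECT DISTINCT is needed.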


8
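votes

You can use EXISTS to check whether the uid already appears on any earlier date. Then take a running sum of the first appearances and pick the maximum per date, which gives the daily cumulative distinct count.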

select dt, max(col) as daily_cumulative_count
from (select t1.*,
             -- add 1 only when this uid has no row on an earlier date
             sum(case when not exists (select 1 from t t2 where t2.dt < t1.dt and t2.uid = t1.uid) then 1 else 0 end) over (order by dt) as col
      from t t1) x
group by dt
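
If the engine rejects a correlated subquery inside a window aggregate, the same idea (flag the first appearance of each uid, then take a running sum) can be sketched with ROW_NUMBER() instead of EXISTS. This assumes the same table t with columns dt and uid as in the query above. The names flagged, rn and running_new are illustrative only.

```sql
SELECT dt, MAX(running_new) AS daily_cumulative_count
FROM (
  SELECT
    dt,
    -- running count of first appearances up to and including this dt
    SUM(CASE WHEN rn = 1 THEN 1 ELSE 0 END) OVER (ORDER BY dt) AS running_new
  FROM (
    -- rn = 1 marks the earliest row for each uid
    SELECT dt, uid,
           ROW_NUMBER() OVER (PARTITION BY uid ORDER BY dt) AS rn
    FROM t
  ) flagged
) x
GROUP BY dt
ORDER BY dt;
```

Since exactly one row per uid gets rn = 1, a uid is counted once even if the table holds duplicate (dt, uid) rows.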

2
votes

Please try the following...

SELECT date AS date,
       COUNT( uid ) AS daily_cumulative_count
FROM ( SELECT leftTable.date AS date,
              rightTable.uid AS uid
       FROM sample_table AS leftTable
       JOIN sample_table AS rightTable ON leftTable.date >= rightTable.date
       GROUP BY leftTable.date,
                rightTable.uid
     ) AS allUIDSForDateFinder
GROUP BY date;

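This statement first joins one instance of sample_table to another, so that each record in leftTable is paired with a copy of every record in rightTable that has an earlier or equal date value. This effectively attaches to each date value the list of all uid values seen up to and including that date.

The resulting dataset is then reduced to unique combinations of date and uid by the GROUP BY in the subquery.

The reduced dataset from the subquery allUIDSForDateFinder is then grouped by date in the main body of the query, and COUNT() is evaluated over the uid values in each group.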

If you have any questions or comments, please feel free to post a comment.


1
vote

WITH version:

WITH t AS (
  SELECT uid, MIN(dt) AS fst_date
  FROM input_table
  GROUP BY uid
)
SELECT DISTINCT fst_date,
       COUNT(uid) OVER (ORDER BY fst_date) AS daily_cumulative_count
FROM t

SELECT version:



0
votes

SELECT DISTINCT fst_date,
       COUNT(uid) OVER (ORDER BY fst_date) AS daily_cumulative_count
FROM (
  SELECT uid, MIN(dt) AS fst_date
  FROM input_table
  GROUP BY uid
) t

Solution:

CREATE TABLE MyTable ( fecha VARCHAR(512), uid INT );
INSERT INTO MyTable (fecha, uid ) VALUES ('1/11/2016', '100');
INSERT INTO MyTable (fecha, uid ) VALUES ('1/11/2016', '200');
INSERT INTO MyTable (fecha, uid ) VALUES ('1/11/2016', '300');
INSERT INTO MyTable (fecha, uid ) VALUES ('1/11/2016', '400');
INSERT INTO MyTable (fecha, uid ) VALUES ('2/11/2016', '100');
INSERT INTO MyTable (fecha, uid ) VALUES ('2/11/2016', '200');
INSERT INTO MyTable (fecha, uid ) VALUES ('3/11/2016', '300');
INSERT INTO MyTable (fecha, uid ) VALUES ('3/11/2016', '400');
INSERT INTO MyTable (fecha, uid ) VALUES ('3/11/2016', '500');
INSERT INTO MyTable (fecha, uid ) VALUES ('3/11/2016', '600');
INSERT INTO MyTable (fecha, uid ) VALUES ('4/11/2016', '700');
INSERT INTO MyTable (fecha, uid ) VALUES ('5/11/2016', '700');
INSERT INTO MyTable (fecha, uid ) VALUES ('6/11/2016', '700');
INSERT INTO MyTable (fecha, uid ) VALUES ('7/11/2016', '700');
INSERT INTO MyTable (fecha, uid ) VALUES ('8/11/2016', '700');
INSERT INTO MyTable (fecha, uid ) VALUES ('8/11/2016', '900');

You can quickly test it here.


0
votes

SELECT t1.fecha,
       COUNT(DISTINCT t2.uid) AS daily_cumulative_count
FROM MyTable t1
INNER JOIN MyTable t2 ON t1.fecha >= t2.fecha
GROUP BY t1.fecha
ORDER BY t1.fecha
