How to get a running count from a string-valued field


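I have a table that I need to manipulate to produce a rolling count. The problem I'm running into is getting the count to keep accumulating.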

id      |sent|read_at|created_at             |sent_at                |direction|actor_type|
--------+----+-------+-----------------------+-----------------------+---------+----------+
TA-12345|   1|       |2022-12-28 13:45:54.000|2022-12-28 13:45:54.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 14:38:47.000|2022-12-28 14:38:47.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 13:47:01.000|2022-12-28 13:47:01.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 13:18:58.000|2022-12-28 13:18:58.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 14:38:51.000|2022-12-28 14:38:51.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 13:52:40.000|2022-12-28 13:52:40.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 15:06:10.000|2022-12-28 15:06:10.000|OUTBOUND |nurse     |
TA-12345|   1|       |2022-12-28 14:41:43.000|2022-12-28 14:41:43.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 13:49:11.000|2022-12-28 13:49:11.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 13:19:35.000|2022-12-28 13:19:35.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 12:58:26.000|2022-12-28 12:58:26.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 13:19:48.000|2022-12-28 13:19:48.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 13:49:14.000|2022-12-28 13:49:14.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 13:51:23.000|2022-12-28 13:51:23.000|INBOUND  |patient   |
TA-12345|   1|       |2022-12-28 13:47:52.000|2022-12-28 13:47:52.000|INBOUND  |patient   |

My code is as follows:

select 
         id
        , sent      
        , read_at
        , created_at 
        , sent_at
        , direction
        , CASE
                WHEN direction = 'OUTBOUND' THEN 'nurse'
                WHEN direction = 'INBOUND' THEN 'patient'
            END as actor_type
    from `projectid`.warehouse.datasetid
        where id = 'TA-12345'

I did use row_number() over(partition by), but I'm not sure whether partitioning by created_at and id would give me an accurate rolling count.

For example:

select 
        id
        , sent      
        , read_at
        , created_at 
        , sent_at
        , direction
        , CASE
                WHEN direction = 'OUTBOUND' THEN 'nurse'
                WHEN direction = 'INBOUND' THEN 'patient'
            END as actor_type
        , row_number() over(PARTITION by direction 
            order by created_at ) as running_count
    from `projectid`.warehouse.datasetid
        where id = 'TA-12345'

This gives me:

id      |sent|read_at|created_at             |sent_at                |direction|actor_type|running_count|
--------+----+-------+-----------------------+-----------------------+---------+----------+-------------+
TA-12345|   1|       |2022-12-28 12:58:26.000|2022-12-28 12:58:26.000|INBOUND  |patient   |            1|
TA-12345|   1|       |2022-12-28 13:19:35.000|2022-12-28 13:19:35.000|INBOUND  |patient   |            2|
TA-12345|   1|       |2022-12-28 13:19:48.000|2022-12-28 13:19:48.000|INBOUND  |patient   |            3|
TA-12345|   1|       |2022-12-28 13:47:52.000|2022-12-28 13:47:52.000|INBOUND  |patient   |            4|
TA-12345|   1|       |2022-12-28 13:49:11.000|2022-12-28 13:49:11.000|INBOUND  |patient   |            5|
TA-12345|   1|       |2022-12-28 13:49:14.000|2022-12-28 13:49:14.000|INBOUND  |patient   |            6|
TA-12345|   1|       |2022-12-28 13:51:23.000|2022-12-28 13:51:23.000|INBOUND  |patient   |            7|
TA-12345|   1|       |2022-12-28 14:41:43.000|2022-12-28 14:41:43.000|INBOUND  |patient   |            8|
TA-12345|   1|       |2022-12-28 13:18:58.000|2022-12-28 13:18:58.000|OUTBOUND |nurse     |            1|
TA-12345|   1|       |2022-12-28 13:45:54.000|2022-12-28 13:45:54.000|OUTBOUND |nurse     |            2|
TA-12345|   1|       |2022-12-28 13:47:01.000|2022-12-28 13:47:01.000|OUTBOUND |nurse     |            3|
TA-12345|   1|       |2022-12-28 13:52:40.000|2022-12-28 13:52:40.000|OUTBOUND |nurse     |            4|
TA-12345|   1|       |2022-12-28 14:38:47.000|2022-12-28 14:38:47.000|OUTBOUND |nurse     |            5|
TA-12345|   1|       |2022-12-28 14:38:51.000|2022-12-28 14:38:51.000|OUTBOUND |nurse     |            6|
TA-12345|   1|       |2022-12-28 15:06:10.000|2022-12-28 15:06:10.000|OUTBOUND |nurse     |            7|

I tried
sum(sent) over (order by direction)
but that gave me two rows, each with a single result. I also tried
SUM(IF(actor_type ="patient", sent, NULL)) AS patient_message_count,
but it raised an error. How can I get a better running count?
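
A possible explanation for the sum(sent) over (order by direction) result, offered as a sketch rather than a confirmed diagnosis: when a window has ORDER BY but no explicit frame, rows that tie on the ordering key are treated as peers and share one cumulative value, so every row of a given direction shows the same total. The Python snippet below uses an in-memory SQLite database as a stand-in for BigQuery (an assumption; SQLite 3.25 or later is needed for window functions), a hypothetical table name messages, and five rows copied from the sample data above. It compares that expression with a sum partitioned by direction and ordered by created_at.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for BigQuery; "messages" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id TEXT, sent INTEGER, created_at TEXT, direction TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        # Five rows taken from the sample table in the question.
        ("TA-12345", 1, "2022-12-28 12:58:26.000", "INBOUND"),
        ("TA-12345", 1, "2022-12-28 13:18:58.000", "OUTBOUND"),
        ("TA-12345", 1, "2022-12-28 13:19:35.000", "INBOUND"),
        ("TA-12345", 1, "2022-12-28 13:45:54.000", "OUTBOUND"),
        ("TA-12345", 1, "2022-12-28 13:47:01.000", "OUTBOUND"),
    ],
)

rows = conn.execute(
    """
    SELECT direction,
           created_at,
           -- Ties on direction are peers, so each direction gets one shared total.
           SUM(sent) OVER (ORDER BY direction) AS sum_by_peers,
           -- Restart per direction and accumulate in time order.
           SUM(sent) OVER (PARTITION BY direction ORDER BY created_at) AS running_count
    FROM messages
    ORDER BY direction, created_at
    """
).fetchall()

for row in rows:
    print(row)
```

In this sketch sum_by_peers takes only two distinct values (one per direction), while running_count increases row by row within each direction.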

The result I want should look like this:

id       |sent|read_at|created_at             |sent_at                |direction|actor_type|running_count|
---------+----+-------+-----------------------+-----------------------+---------+----------+-------------+
TA-100000|   1|       |2023-03-20 09:12:41.000|2023-03-20 09:12:41.000|INBOUND  |patient   |            1|
TA-100001|   1|       |2023-03-13 23:45:34.000|2023-03-13 23:45:34.000|INBOUND  |patient   |            1|
TA-100009|   1|       |2023-03-01 10:06:26.000|2023-03-01 10:06:26.000|INBOUND  |patient   |            1|
TA-100011|   1|       |2023-03-06 17:36:21.000|2023-03-06 17:36:21.000|INBOUND  |patient   |            1|
TA-100011|   1|       |2023-03-07 11:36:25.000|2023-03-07 11:36:25.000|INBOUND  |patient   |            2|
TA-100011|   1|       |2023-03-21 12:02:31.000|2023-03-21 12:02:31.000|INBOUND  |patient   |            3|
TA-100014|   1|       |2023-03-17 07:47:11.000|2023-03-17 07:47:11.000|INBOUND  |patient   |            1|
TA-100014|   1|       |2023-03-17 07:47:23.000|2023-03-17 07:47:23.000|INBOUND  |patient   |            2|
TA-100014|   1|       |2023-03-17 07:47:40.000|2023-03-17 07:47:40.000|INBOUND  |patient   |            3|
TA-100014|   1|       |2023-03-17 14:12:46.000|2023-03-17 14:12:46.000|INBOUND  |patient   |            4|
TA-100016|   1|       |2023-03-02 11:10:50.000|2023-03-02 11:10:50.000|INBOUND  |patient   |            1|
TA-100017|   1|       |2023-03-03 12:13:03.000|2023-03-03 12:13:03.000|INBOUND  |patient   |            1|
TA-100019|   1|       |2023-03-31 10:40:19.000|2023-03-31 10:40:19.000|INBOUND  |patient   |            1|
TA-100020|   1|       |2023-03-07 12:32:23.000|2023-03-07 12:32:23.000|INBOUND  |patient   |            1|
TA-100021|   1|       |2023-03-03 11:02:17.000|2023-03-03 11:02:17.000|INBOUND  |patient   |            1|
TA-100024|   1|       |2023-04-03 18:45:19.000|2023-04-03 18:45:19.000|INBOUND  |patient   |            1|
TA-100024|   1|       |2023-04-03 18:56:57.000|2023-04-03 18:56:57.000|INBOUND  |patient   |            2|
TA-100024|   1|       |2023-04-03 18:57:10.000|2023-04-03 18:57:10.000|INBOUND  |patient   |            3|
TA-100024|   1|       |2023-04-04 08:36:56.000|2023-04-04 08:36:56.000|INBOUND  |patient   |            4|
TA-100024|   1|       |2023-04-19 14:57:00.000|2023-04-19 14:57:00.000|INBOUND  |patient   |            5|

When partitioning by id and direction, I get:

id       |sent|read_at|created_at             |sent_at                |direction|actor_type|running_count|
---------+----+-------+-----------------------+-----------------------+---------+----------+-------------+
TA-100000|   1|       |2023-04-05 15:00:44.000|2023-04-05 15:00:44.000|OUTBOUND |nurse     |            1|
TA-100000|   1|       |2023-03-20 09:12:41.000|2023-03-20 09:12:41.000|INBOUND  |patient   |            1|
TA-100000|   1|       |2023-03-20 08:50:18.000|2023-03-20 08:50:18.000|OUTBOUND |nurse     |            1|
TA-100000|   1|       |2023-03-20 09:13:42.000|2023-03-20 09:13:42.000|OUTBOUND |nurse     |            1|
TA-100001|   1|       |2023-03-13 23:45:34.000|2023-03-13 23:45:34.000|INBOUND  |patient   |            1|
TA-100001|   1|       |2023-03-14 10:31:31.000|2023-03-14 10:31:31.000|OUTBOUND |nurse     |            1|
TA-100001|   1|       |2023-03-13 13:39:44.000|2023-03-13 13:39:44.000|OUTBOUND |nurse     |            1|
TA-100001|   1|       |2023-03-01 13:11:30.000|2023-03-01 13:11:30.000|OUTBOUND |nurse     |            1|

Expected:

id       |sent|read_at|created_at             |sent_at                |direction|actor_type|running_count|
---------+----+-------+-----------------------+-----------------------+---------+----------+-------------+
TA-100000|   1|       |2023-04-05 15:00:44.000|2023-04-05 15:00:44.000|OUTBOUND |nurse     |            1|
TA-100000|   1|       |2023-03-20 09:12:41.000|2023-03-20 09:12:41.000|INBOUND  |patient   |            1|
TA-100000|   1|       |2023-03-20 08:50:18.000|2023-03-20 08:50:18.000|OUTBOUND |nurse     |            2|
TA-100000|   1|       |2023-03-20 09:13:42.000|2023-03-20 09:13:42.000|OUTBOUND |nurse     |            3|
TA-100001|   1|       |2023-03-13 23:45:34.000|2023-03-13 23:45:34.000|INBOUND  |patient   |            1|
TA-100001|   1|       |2023-03-14 10:31:31.000|2023-03-14 10:31:31.000|OUTBOUND |nurse     |            1|
TA-100001|   1|       |2023-03-13 13:39:44.000|2023-03-13 13:39:44.000|OUTBOUND |nurse     |            2|
TA-100001|   1|       |2023-03-01 13:11:30.000|2023-03-01 13:11:30.000|OUTBOUND |nurse     |            3|
sql google-bigquery counter counting
1 Answer

I tried row_number() over(PARTITION by id,direction order by id,direction ) as running_count:

CREATE TABLE datasetid 
(
    id  VARCHAR(512),
    sent    VARCHAR(512),
    read_at VARCHAR(512),
    created_at  VARCHAR(512),
    sent_at VARCHAR(512),
    direction   VARCHAR(512),
    actor_type  VARCHAR(512)
);

INSERT INTO datasetid (id, sent, read_at, created_at, sent_at, direction, actor_type) VALUES
    ('TA-100000', '1', '', '2023-04-05 15:00:44.000', '2023-04-05 15:00:44.000', 'OUTBOUND', 'nurse'),
    ('TA-100000', '1', '', '2023-03-20 09:12:41.000', '2023-03-20 09:12:41.000', 'INBOUND', 'patient'),
    ('TA-100000', '1', '', '2023-03-20 08:50:18.000', '2023-03-20 08:50:18.000', 'OUTBOUND', 'nurse'),
    ('TA-100000', '1', '', '2023-03-20 09:13:42.000', '2023-03-20 09:13:42.000', 'OUTBOUND', 'nurse'),
    ('TA-100001', '1', '', '2023-03-13 23:45:34.000', '2023-03-13 23:45:34.000', 'INBOUND', 'patient'),
    ('TA-100001', '1', '', '2023-03-14 10:31:31.000', '2023-03-14 10:31:31.000', 'OUTBOUND', 'nurse'),
    ('TA-100001', '1', '', '2023-03-13 13:39:44.000', '2023-03-13 13:39:44.000', 'OUTBOUND', 'nurse'),
    ('TA-100001', '1', '', '2023-03-01 13:11:30.000', '2023-03-01 13:11:30.000', 'OUTBOUND', 'nurse');



SELECT *,
  row_number() over(PARTITION by id,direction 
            order by id,direction ) as running_count
FROM datasetid

id       |sent|read_at|created_at             |sent_at                |direction|actor_type|running_count|
---------+----+-------+-----------------------+-----------------------+---------+----------+-------------+
TA-100000|   1|       |2023-03-20 09:12:41.000|2023-03-20 09:12:41.000|INBOUND  |patient   |            1|
TA-100000|   1|       |2023-03-20 08:50:18.000|2023-03-20 08:50:18.000|OUTBOUND |nurse     |            1|
TA-100000|   1|       |2023-03-20 09:13:42.000|2023-03-20 09:13:42.000|OUTBOUND |nurse     |            2|
TA-100000|   1|       |2023-04-05 15:00:44.000|2023-04-05 15:00:44.000|OUTBOUND |nurse     |            3|
TA-100001|   1|       |2023-03-13 23:45:34.000|2023-03-13 23:45:34.000|INBOUND  |patient   |            1|
TA-100001|   1|       |2023-03-14 10:31:31.000|2023-03-14 10:31:31.000|OUTBOUND |nurse     |            1|
TA-100001|   1|       |2023-03-13 13:39:44.000|2023-03-13 13:39:44.000|OUTBOUND |nurse     |            2|
TA-100001|   1|       |2023-03-01 13:11:30.000|2023-03-01 13:11:30.000|OUTBOUND |nurse     |            3|
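
One caveat on the query above: ordering by id and direction inside a window that is already partitioned by id and direction leaves every row in a partition tied, so the numbering within a partition is not guaranteed to be stable. If the count should follow message time, ordering by created_at is one option. The sketch below assumes that chronological numbering is what is wanted (the expected output in the question numbers rows in listed order, so this may not match it exactly). It again uses in-memory SQLite via Python as a stand-in for BigQuery, with a trimmed-down version of the datasetid table holding only the three columns the window needs.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for BigQuery; only the needed columns are kept.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasetid (id TEXT, created_at TEXT, direction TEXT)")
conn.executemany(
    "INSERT INTO datasetid VALUES (?, ?, ?)",
    [
        # Same eight rows as the INSERT statement in the answer.
        ("TA-100000", "2023-04-05 15:00:44.000", "OUTBOUND"),
        ("TA-100000", "2023-03-20 09:12:41.000", "INBOUND"),
        ("TA-100000", "2023-03-20 08:50:18.000", "OUTBOUND"),
        ("TA-100000", "2023-03-20 09:13:42.000", "OUTBOUND"),
        ("TA-100001", "2023-03-13 23:45:34.000", "INBOUND"),
        ("TA-100001", "2023-03-14 10:31:31.000", "OUTBOUND"),
        ("TA-100001", "2023-03-13 13:39:44.000", "OUTBOUND"),
        ("TA-100001", "2023-03-01 13:11:30.000", "OUTBOUND"),
    ],
)

rows = conn.execute(
    """
    SELECT id,
           direction,
           created_at,
           -- Restart the count for each (id, direction) pair and number rows by time.
           -- created_at is text here; this fixed 'YYYY-MM-DD HH:MM:SS.mmm' layout
           -- sorts chronologically as a string.
           ROW_NUMBER() OVER (
               PARTITION BY id, direction
               ORDER BY created_at
           ) AS running_count
    FROM datasetid
    ORDER BY id, direction, created_at
    """
).fetchall()

for row in rows:
    print(row)
```

With this ordering the count for each id and direction always starts at the earliest message, regardless of the order in which rows were inserted.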