How to aggregate the first value in a group


I'm having trouble building some aggregations. I'm using this table in Postgres:

CREATE TABLE public.customer_courier_chat_messages (
sender_app_type character varying(255),
customer_id integer,
from_id integer,
to_id integer,
chat_started_by_message boolean,
order_id integer,
order_stage character varying(255),
courier_id integer,
message_sent_time timestamp with time zone
);

Some sample rows:

INSERT INTO public.customer_courier_chat_messages(
    sender_app_type, customer_id, from_id, to_id, chat_started_by_message, order_id, order_stage, courier_id, message_sent_time)
VALUES  ('Customer IOS',99,99,21,FALSE,555,'PICKING_UP',21,timestamp '9/8/22 8:02'),
        ('Courier IOS',99,21,99,FALSE,555,'ARRIVING',21,timestamp '9/8/22 8:01'),
        ('Customer IOS',99,99,21,FALSE,555,'PICKING_UP',21,timestamp '9/8/22 8:00'),
        ('Courier Android',122,87,122,TRUE,38,'ADDRESS_DELIVERY',87,timestamp '9/8/22 7:55'),
        ('Customer Android',43,43,75,FALSE,875,'PICKING_UP',75,timestamp '7/8/22 14:55'),
        ('Courier Android',43,75,43,FALSE,875,'ARRIVING',75,timestamp '7/8/22 14:53'),
        ('Customer Android',43,43,75,FALSE,875,'PICKING_UP',75,timestamp '7/8/22 14:51'),
        ('Courier Android',43,75,43,TRUE,875,'ADDRESS_DELIVERY',75,timestamp '7/8/22 14:50'),
        ('Customer IOS',23,23,21,FALSE,134,'PICKING_UP',21,timestamp '7/8/22 10:02'),
        ('Courier IOS',23,21,23,FALSE,134,'ARRIVING',21,timestamp '7/8/22 10:01'),
        ('Customer IOS',23,23,21,FALSE,134,'PICKING_UP',21,timestamp '7/8/22 10:00');

I need to produce these summary results:

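  • first_courier_message: timestamp of the first courier message
  • first_customer_message: timestamp of the first customer message
  • num_messages_courier: number of messages sent by the courier
  • num_messages_customer: number of messages sent by the customer
  • first_message_by: sender of the first message (courier or customer)
  • conversation_started_at: timestamp of the first message in the conversation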

This is what I have so far:

SELECT  ccc.order_id, 
    ord.city_code,
    string_agg(ccc.message_sent_time::character varying, ',' order by ccc.courier_id desc) as first_courier_message,
    string_agg(ccc.message_sent_time::character varying, ',' order by ccc.customer_id asc) as first_customer_message,
    count(ccc.courier_id) as num_messages_courier,  
    count(ccc.customer_id) as num_messages_customer,
    string_agg(ccc.sender_app_type,' ' order by )
FROM customer_courier_chat_messages ccc
INNER JOIN "Orders" ord
  ON ccc.order_id = ord.order_id
group by ccc.order_id, ord.city_code;

Am I heading in the right direction?
How can I implement the last two items?

sql postgresql postgresql-9.5
2 answers

1 vote

You need to use the window functions first_value() and last_value().

So in your case:

select ....
FIRST_VALUE(ccc.courier_id) OVER( ORDER BY ccc.message_sent_time) as first_courier_message,
FIRST_VALUE(ccc.customer_id) OVER( ORDER BY ccc.message_sent_time) as first_customer_message,
...

https://www.postgresqltutorial.com/postgresql-window-function/postgresql-first_value-function/
https://www.postgresql.org/docs/current/tutorial-window.html
https://www.postgresql.org/docs/current/functions-window.html
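
Note that the snippet above, as written, returns courier and customer IDs rather than timestamps, and without PARTITION BY the window spans the whole table. Here is a minimal sketch of how first_value() could answer the last two items per order. It assumes that each order_id is one conversation and that the first word of sender_app_type ('Courier' or 'Customer') identifies the sender; neither is stated in the question.

```sql
-- Sketch only: first sender and start time per order via window functions.
-- Assumption: one order_id = one conversation.
-- Assumption: the first word of sender_app_type is the sender role.
SELECT DISTINCT
       order_id,
       first_value(split_part(sender_app_type, ' ', 1)) OVER w AS first_message_by,
       first_value(message_sent_time)                   OVER w AS conversation_started_at
FROM public.customer_courier_chat_messages
-- Named window: rows of the same order, earliest message first.
WINDOW w AS (PARTITION BY order_id ORDER BY message_sent_time);
```

Window functions do not collapse rows, so DISTINCT is used here to reduce the output to one row per order. If last_value() is used with an ORDER BY window, the default frame ends at the current row, so it typically needs an explicit frame such as ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.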


0 votes

Do you mean something like this?

WITH first_msg_by(first_msg_by) AS (
  SELECT SPLIT_PART(sender_app_type,' ',1)
  FROM public.customer_courier_chat_messages
  ORDER BY message_sent_time
  LIMIT 1
)
SELECT
  MIN  (CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Courier'  THEN message_sent_time END) AS first_courier_msg
, MIN  (CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Customer' THEN message_sent_time END) AS first_cust_msg
, COUNT(CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Courier'  THEN message_sent_time END) AS count_courier_msg
, COUNT(CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Customer' THEN message_sent_time END) AS count_cust_msg
, MIN(first_msg_by)                                                                        AS first_msg_by
, MIN(message_sent_time)                                                                   AS conv_started_at
FROM public.customer_courier_chat_messages CROSS JOIN first_msg_by;
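
Result:

first_courier_msg   | first_cust_msg      | count_courier_msg | count_cust_msg | first_msg_by | conv_started_at
--------------------+---------------------+-------------------+----------------+--------------+--------------------
2022-07-08 10:01:00 | 2022-07-08 10:00:00 |                 5 |              6 | Customer     | 2022-07-08 10:00:00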
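
The query above summarizes the whole table as a single conversation. If the summaries are wanted per order, as the GROUP BY in the question suggests, one possible variant is sketched below. It assumes that each order_id is one conversation and that sender_app_type always starts with 'Courier' or 'Customer'. The join to "Orders" for city_code is left out because that table's definition is not shown.

```sql
-- Sketch only: all six summaries per order in a single pass.
-- Assumption: one order_id = one conversation.
-- Assumption: sender_app_type starts with 'Courier' or 'Customer'.
SELECT order_id,
       min(message_sent_time) FILTER (WHERE sender_app_type LIKE 'Courier%')  AS first_courier_message,
       min(message_sent_time) FILTER (WHERE sender_app_type LIKE 'Customer%') AS first_customer_message,
       count(*) FILTER (WHERE sender_app_type LIKE 'Courier%')  AS num_messages_courier,
       count(*) FILTER (WHERE sender_app_type LIKE 'Customer%') AS num_messages_customer,
       -- Collect the sender roles ordered by send time and keep the first one.
       (array_agg(split_part(sender_app_type, ' ', 1)
                  ORDER BY message_sent_time))[1]                AS first_message_by,
       min(message_sent_time)                                    AS conversation_started_at
FROM public.customer_courier_chat_messages
GROUP BY order_id
ORDER BY order_id;
```

The FILTER clause is available from PostgreSQL 9.4, so it should work on the 9.5 version tagged in the question. On older versions the CASE WHEN form used in the answer above does the same job.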