I'm having trouble building some aggregations. I'm using this table in Postgres:
CREATE TABLE public.customer_courier_chat_messages (
sender_app_type character varying(255),
customer_id integer,
from_id integer,
to_id integer,
chat_started_by_message boolean,
order_id integer,
order_stage character varying(255),
courier_id integer,
message_sent_time timestamp with time zone
);
Some sample rows:
INSERT INTO public.customer_courier_chat_messages(
sender_app_type, customer_id, from_id, to_id, chat_started_by_message, order_id, order_stage, courier_id, message_sent_time)
VALUES ('Customer IOS',99,99,21,FALSE,555,'PICKING_UP',21,timestamp '9/8/22 8:02'),
('Courier IOS',99,21,99,FALSE,555,'ARRIVING',21,timestamp '9/8/22 8:01'),
('Customer IOS',99,99,21,FALSE,555,'PICKING_UP',21,timestamp '9/8/22 8:00'),
('Courier Android',122,87,122,TRUE,38,'ADDRESS_DELIVERY',87,timestamp '9/8/22 7:55'),
('Customer Android',43,43,75,FALSE,875,'PICKING_UP',75,timestamp '7/8/22 14:55'),
('Courier Android',43,75,43,FALSE,875,'ARRIVING',75,timestamp '7/8/22 14:53'),
('Customer Android',43,43,75,FALSE,875,'PICKING_UP',75,timestamp '7/8/22 14:51'),
('Courier Android',43,75,43,TRUE,875,'ADDRESS_DELIVERY',75,timestamp '7/8/22 14:50'),
('Customer IOS',23,23,21,FALSE,134,'PICKING_UP',21,timestamp '7/8/22 10:02'),
('Courier IOS',23,21,23,FALSE,134,'ARRIVING',21,timestamp '7/8/22 10:01'),
('Customer IOS',23,23,21,FALSE,134,'PICKING_UP',21,timestamp '7/8/22 10:00');
I need to produce these aggregated results:
This is what I have so far:
SELECT ccc.order_id,
ord.city_code,
string_agg(ccc.message_sent_time::character varying, ',' order by ccc.courier_id desc) as first_courier_message,
string_agg(ccc.message_sent_time::character varying, ',' order by ccc.customer_id asc) as first_customer_message,
count(ccc.courier_id) as num_messages_courier,
count(ccc.customer_id) as num_messages_customer,
string_agg(ccc.sender_app_type,' ' order by )
FROM customer_courier_chat_messages ccc
INNER JOIN "Orders" ord
ON ccc.order_id = ord.order_id
group by ccc.order_id, ord.city_code;
Am I on the right track?
How can I implement the last two items?
You need to use the window functions first_value() and last_value().
So in your case:
select ....
FIRST_VALUE(ccc.courier_id) OVER( ORDER BY ccc.message_sent_time) as first_courier_message,
FIRST_VALUE(ccc.customer_id) OVER( ORDER BY ccc.message_sent_time) as first_customer_message,
...
https://www.postgresqltutorial.com/postgresql-window-function/postgresql-first_value-function/
https://www.postgresql.org/docs/current/tutorial-window.html
https://www.postgresql.org/docs/current/functions-window.html
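For illustration, here is a minimal sketch of how the window-function approach might look when applied per order. It assumes the goal is one row per order_id, which the question's GROUP BY suggests. The output column names (first_message_by, conversation_started_at, last_message_time) are hypothetical, and the join to "Orders" is left out because that table's definition is not shown.

```sql
-- Sketch only: window functions keep one row per message,
-- so DISTINCT ON collapses the result to one row per order.
SELECT DISTINCT ON (order_id)
       order_id,
       FIRST_VALUE(sender_app_type)   OVER w AS first_message_by,         -- hypothetical name
       FIRST_VALUE(message_sent_time) OVER w AS conversation_started_at,  -- hypothetical name
       LAST_VALUE(message_sent_time)  OVER w AS last_message_time         -- hypothetical name
FROM public.customer_courier_chat_messages
WINDOW w AS (
    PARTITION BY order_id
    ORDER BY message_sent_time
    -- The default frame ends at the current row, so LAST_VALUE would just
    -- return the current row's value; extend the frame to the whole partition.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY order_id;
```

The PARTITION BY clause is what keeps messages from different orders apart. Without it, FIRST_VALUE would return the earliest message in the whole table for every row.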
Do you mean something like this?
WITH first_msg_by(first_msg_by) AS (
SELECT SPLIT_PART(sender_app_type,' ',1)
FROM public.customer_courier_chat_messages
ORDER BY message_sent_time
LIMIT 1
)
SELECT
MIN (CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Courier' THEN message_sent_time END) AS first_courier_msg
, MIN (CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Customer' THEN message_sent_time END) AS first_cust_msg
, COUNT(CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Courier' THEN message_sent_time END) AS count_courier_msg
, COUNT(CASE WHEN SPLIT_PART(sender_app_type,' ',1)='Customer' THEN message_sent_time END) AS count_cust_msg
, MIN(first_msg_by) AS first_msg_by
, MIN(message_sent_time) AS conv_started_at
FROM public.customer_courier_chat_messages CROSS JOIN first_msg_by;
first_courier_msg | first_cust_msg | count_courier_msg | count_cust_msg | first_msg_by | conv_started_at |
---|---|---|---|---|---|
2022-07-08 10:01:00 | 2022-07-08 10:00:00 | 5 | 6 | Customer | 2022-07-08 10:00:00 |
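The query above aggregates over the whole table, which is why it returns a single row. If one row per order is wanted, as the GROUP BY in the question suggests, a per-order variant might look like the sketch below. It reuses the column aliases from the query above and assumes that sender_app_type always starts with either 'Courier' or 'Customer', as it does in the sample rows. The FILTER clause requires PostgreSQL 9.4 or later.

```sql
-- Sketch only: same aggregates as above, but grouped per order.
SELECT order_id,
       MIN(message_sent_time) FILTER (WHERE sender_app_type LIKE 'Courier%')  AS first_courier_msg,
       MIN(message_sent_time) FILTER (WHERE sender_app_type LIKE 'Customer%') AS first_cust_msg,
       COUNT(*) FILTER (WHERE sender_app_type LIKE 'Courier%')  AS count_courier_msg,
       COUNT(*) FILTER (WHERE sender_app_type LIKE 'Customer%') AS count_cust_msg,
       -- Collect the sender types in chronological order and keep the first one.
       (ARRAY_AGG(SPLIT_PART(sender_app_type, ' ', 1) ORDER BY message_sent_time))[1] AS first_msg_by,
       MIN(message_sent_time) AS conv_started_at
FROM public.customer_courier_chat_messages
GROUP BY order_id
ORDER BY order_id;
```

With the sample rows this yields four rows, one for each of the orders 38, 134, 555 and 875. An order with no message from one side gets NULL for that side's first-message time and 0 for its count.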