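I'm trying to join some data together. My goal is a list of sellers with, next to each seller, the median time it took him to sell an item.
The following query works as expected for a single seller (dan) and returns one value: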
SELECT
    AVG(timetook) AS Median
FROM (
    SELECT
        TIMESTAMPDIFF(MINUTE, starttime, endtime) AS timetook,
        @rownum:=@rownum+1 as `row_number`,
        @total_rows:=@rownum
    FROM items, (SELECT @rownum:=0) r
    WHERE total_price > 100 AND seller_name = "dan"
    ORDER BY timetook ASC
) AS temp
WHERE
    `row_number` = FLOOR((@total_rows + 1) / 2) OR
    `row_number` = CEIL((@total_rows + 1) / 2)
However, when I try to extend it to a list of sellers (for testing I'm only using two sellers), I get NULL for the sellers' median.
SELECT
    q1.seller_name,
    q1.records_found,
    q2.median_timetook
FROM (
    SELECT
        seller_name,
        COUNT(*) AS records_found
    FROM items
    WHERE
        total_price > 100 AND
        TIMESTAMPDIFF(MINUTE, starttime, endtime) < 60
    GROUP BY seller_name
    HAVING COUNT(*) >= 3
) AS q1
LEFT JOIN (
    SELECT
        seller_name,
        AVG(timetook) AS median_timetook
    FROM (
        SELECT
            seller_name,
            TIMESTAMPDIFF(MINUTE, starttime, endtime) AS timetook,
            @rownum:=@rownum+1 AS row_number,
            @total_rows:=@rownum
        FROM
            items,
            (SELECT @rownum:=0) r
        ORDER BY
            timetook ASC
    ) AS temp
    WHERE
        row_number = FLOOR((@total_rows + 1) / 2) OR
        row_number = CEIL((@total_rows + 1) / 2)
    GROUP BY
        seller_name
) AS q2 ON q1.seller_name = q2.seller_name
WHERE
    q1.seller_name IN ('ron', 'dan')
GROUP BY
    q1.seller_name
ORDER BY
    q1.records_found DESC
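I would recommend window functions rather than user variables. Assigning user variables inside expressions (the @rownum := @rownum + 1 pattern) has been deprecated since MySQL 8.0.13 and is slated for removal in a future MySQL release. Window functions, available in MySQL 8.0 and later, are also easier to use, especially when you need to manage partitions. In your multi-seller query the counter runs across all sellers at once, so the outer WHERE keeps only the one or two rows in the middle of the whole table, and every seller who does not own one of those rows gets NULL from the LEFT JOIN.
Here is one way to compute the median per seller: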
select seller_name, avg(duration) as median_duration
from (
    select i.*,
        -- position of each row within its seller, ordered by duration
        row_number() over(partition by seller_name order by duration) rn,
        -- number of rows for that seller
        count(*) over(partition by seller_name) cnt
    from (
        select i.*,
            timestampdiff(minute, starttime, endtime) duration
        from items i
    ) i
) i
-- keep the middle row (odd cnt) or the two middle rows (even cnt)
where rn in ( floor((cnt + 1) / 2), floor((cnt + 2) / 2) )
group by seller_name
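If you want to sanity-check the middle-row logic without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It rests on several assumptions: SQLite (3.25 or later, for window functions) stands in for MySQL 8, the sample rows are made up, and the duration column is stored directly because TIMESTAMPDIFF is MySQL-specific. The helper name seller_medians is hypothetical. Since SQLite's / on integers already truncates, the FLOOR calls are dropped here; in MySQL you still need them.

```python
import sqlite3

# Hypothetical sample data: (seller_name, duration in minutes).
# In MySQL, duration would come from TIMESTAMPDIFF(MINUTE, starttime, endtime).
ROWS = [
    ("dan", 10), ("dan", 30), ("dan", 20),             # odd count: middle row
    ("ron", 4), ("ron", 10), ("ron", 20), ("ron", 40),  # even count: two middle rows
    ("amy", 7),                                         # single row
]

MEDIAN_SQL = """
SELECT seller_name, cnt AS records_found, AVG(duration) AS median_duration
FROM (
    SELECT seller_name, duration,
           ROW_NUMBER() OVER (PARTITION BY seller_name ORDER BY duration) AS rn,
           COUNT(*)     OVER (PARTITION BY seller_name)                   AS cnt
    FROM items
) AS ranked
-- integer division in SQLite; MySQL needs FLOOR((cnt + 1) / 2) etc.
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
GROUP BY seller_name, cnt
ORDER BY seller_name
"""


def seller_medians(rows):
    """Load rows into an in-memory table and return per-seller medians."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (seller_name TEXT, duration INTEGER)")
        conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
        return conn.execute(MEDIAN_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for seller, records_found, median in seller_medians(ROWS):
        print(seller, records_found, median)
```

Because cnt is computed per partition, it can be returned as records_found in the same pass, so the separate counting subquery and the LEFT JOIN from the question are not needed. Filters such as total_price > 100 would go in the innermost query, before the rows are numbered.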