I have a table of flight information between cities that looks like this:
origin_city dest_city time
Dothan AL Atlanta GA 171
Dothan AL Atlanta GA 171
Dothan AL Elsewhere AL 2
Dothan AL Elsewhere AL 2
Dothan AL Elsewhere AL 2
Boston MA New York NY 5
Boston MA City MA 1
New York NY Boston MA 5
New York NY Boston MA 5
New York NY Boston MA 5
New York NY Poughkipsie NY 2
For each origin city, I want to find the percentage of flights that are less than 3 hours long. So the result would look like this:
Dothan AL 60
Boston MA 50
New York NY 25
The code I thought would work looks like this:
SELECT F.origin_city as origin_city,
((SELECT COUNT(*) FROM Flights as F2
WHERE F2.actual_time < 3) / (SELECT COUNT(*) FROM Flights as F3)) * 100
AS percentage
FROM Flights as F
GROUP BY F.origin_city
ORDER BY percentage;
GO
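When I run it, I get the list of origin cities and a percentage column, as expected, but the percentage is always 0. I'm still pretty confused by subqueries (as you can probably see).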
I would do this with AVG() as a conditional aggregate:
SELECT F.origin_city as origin_city,
AVG(CASE WHEN F.actual_time < 3 THEN 100.0 ELSE 0 END) as percentage
FROM Flights F
GROUP BY F.origin_city
ORDER BY percentage;
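This assumes the time is measured in hours. According to Google Maps, you can walk from the center of Dothan to Atlanta in 68 hours, so 171 is suspicious.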
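As a quick sanity check, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the AVG() query against them. SQLite is only a stand-in here (an assumption, the question appears to use SQL Server), and the helper name percent_short_flights is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (origin_city, dest_city, actual_time).
ROWS = [
    ("Dothan AL", "Atlanta GA", 171),
    ("Dothan AL", "Atlanta GA", 171),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Dothan AL", "Elsewhere AL", 2),
    ("Boston MA", "New York NY", 5),
    ("Boston MA", "City MA", 1),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Boston MA", 5),
    ("New York NY", "Poughkipsie NY", 2),
]


def percent_short_flights(rows, limit=3):
    """Return (origin_city, percentage) pairs, lowest percentage first."""
    # Assumption: SQLite stands in for the asker's database engine.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Flights (origin_city TEXT, dest_city TEXT, actual_time INTEGER)"
    )
    conn.executemany("INSERT INTO Flights VALUES (?, ?, ?)", rows)
    query = """
        SELECT F.origin_city AS origin_city,
               AVG(CASE WHEN F.actual_time < ? THEN 100.0 ELSE 0 END) AS percentage
        FROM Flights AS F
        GROUP BY F.origin_city
        ORDER BY percentage
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for city, pct in percent_short_flights(ROWS):
        print(city, pct)
```

Each flight contributes either 100.0 or 0, so the per-city average is the percentage directly, and using 100.0 rather than 100 keeps the arithmetic in floating point.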
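Your percentage is computed over the whole table rather than per origin city. Also, dividing one integer count by another is integer division in SQL Server (which the GO separator suggests you are using), so every ratio below 1 is truncated to 0. Try something like this, multiplying by 100.0 before dividing:
SELECT F.origin_city as origin_city,
SUM(CASE WHEN F.actual_time < 3 THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS percentage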
FROM Flights as F
GROUP BY F.origin_city
ORDER BY percentage;
GO
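To see why a percentage built from two integer counts can come out as 0, here is a small sketch that compares the two forms of the expression on the question's sample data. It assumes SQLite as a stand-in engine (it also truncates integer division, as SQL Server does), and the helper name run_percentage is hypothetical.

```python
import sqlite3

# Sample data from the question, reduced to (origin_city, actual_time).
ROWS = [
    ("Dothan AL", 171), ("Dothan AL", 171),
    ("Dothan AL", 2), ("Dothan AL", 2), ("Dothan AL", 2),
    ("Boston MA", 5), ("Boston MA", 1),
    ("New York NY", 5), ("New York NY", 5), ("New York NY", 5),
    ("New York NY", 2),
]

# Counts the flights shorter than 3 within each group.
SHORT = "SUM(CASE WHEN actual_time < 3 THEN 1 ELSE 0 END)"


def run_percentage(expr):
    """Evaluate a per-city percentage expression; rows are sorted by city."""
    # Assumption: SQLite stands in for the asker's database engine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Flights (origin_city TEXT, actual_time INTEGER)")
    conn.executemany("INSERT INTO Flights VALUES (?, ?)", ROWS)
    sql = (
        f"SELECT origin_city, {expr} AS percentage "
        "FROM Flights GROUP BY origin_city ORDER BY origin_city"
    )
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Integer divided by integer truncates first, so multiplying afterwards is too late.
truncated = run_percentage(f"({SHORT} / COUNT(*)) * 100")
# Multiplying by 100.0 first forces floating-point division.
fixed = run_percentage(f"{SHORT} * 100.0 / COUNT(*)")

if __name__ == "__main__":
    print(truncated)
    print(fixed)
```

The only difference between the two expressions is where the floating-point literal enters the calculation.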
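FWIW, the problem with your current subqueries is that nothing correlates them with the current row, so each one counts over the entire table. You could probably rewrite it like this: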
SELECT F.origin_city as origin_city,
(SELECT COUNT(*) FROM Flights as F2
WHERE F2.origin_city = F.origin_city AND F2.actual_time < 3) * 100.0
/ (SELECT COUNT(*) FROM Flights as F3 WHERE F3.origin_city = F.origin_city)
AS percentage
FROM Flights as F
GROUP BY F.origin_city
ORDER BY percentage;
GO
But that needlessly re-queries the table for every row when you already have enough data to do the calculation, as shown above.