How to efficiently count the categories of a categorical attribute (250)? PostgreSQL or Python

Question Votes: 0 Answers: 1

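I have a large database with 50 attributes (8 of them categorical), and I need to create a summary containing the count of every category of each variable, grouped by city and state. One of the attributes has more than 250 categories.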

So far I have been able to write a query that counts a single category of one attribute, grouped by city, and export the result to CSV.

(select city as "City", COUNT(use4) as "use2056"
from demo
where use4 = '2056'
group by city
order by city asc)

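I was considering copying and pasting the results manually (I know that would take forever), but the outputs I get have different rows. In addition, some cities in the US share the same name (I will eventually visualize this data). I tried using multiple SELECTs in a single query, but could not get it to work.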


Select
(select city as "City", COUNT(use4) as "use2056"
from demo
where use4 = '2056'
group by city
order by city asc),
(COUNT(use4) as "use2436"
from demo
where use4 = '2436'
group by city
order by city asc),
(COUNT(use4) as "use9133"
from demo
where use4 = '9133'
group by city
order by city asc)

I also tried adding city and county along with additional counts:

(select zip as "ZIPCODE", city, county, COUNT(use4) as "Use4count1466", COUNT(use4) as "Use4count9133"
from demo
where use4 = '1466',
where use4 = '9133' 
group by zip, city, county
order by zip asc)

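Is there an efficient way to do this? Should I create a loop that keeps counting every category of every attribute? How many SELECTs can a single query have? I need a way to show the zip code, county, and city, and to count all categories of each categorical attribute.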

postgresql count categorical-data
1 Answer
0
votes

You can do this in a single query using filtered aggregation:

select city, 
       count(*) filter (where use4 = '2056') as use2056,
       count(*) filter (where use4 = '2436') as use2436,
       count(*) filter (where use4 = '9133') as use9133
from demo
where use4 in ('2056', '2436', '9133')
group by city;
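
With more than 250 categories, writing one `filter` line per category by hand gets tedious. As a rough sketch (not part of the original answer), the query text could be generated in Python. The helper name `build_count_query` and the `use4_<category>` alias scheme are hypothetical, and the category list is assumed to be known in advance, for example from a prior `select distinct use4 from demo`.

```python
def build_count_query(table, group_cols, attr, categories):
    """Build a PostgreSQL query with one filtered count(*) per category.

    Hypothetical helper: the names and alias scheme are illustrative only.
    """
    def lit(value):
        # Double any single quote so the value is a valid SQL string literal.
        return "'" + str(value).replace("'", "''") + "'"

    def ident(name):
        # Double-quote identifiers so unusual column names stay valid.
        return '"' + str(name).replace('"', '""') + '"'

    groups = ", ".join(ident(c) for c in group_cols)
    counts = ",\n       ".join(
        f"count(*) filter (where {ident(attr)} = {lit(c)}) as {ident(attr + '_' + str(c))}"
        for c in categories
    )
    in_list = ", ".join(lit(c) for c in categories)
    return (
        f"select {groups},\n       {counts}\n"
        f"from {ident(table)}\n"
        f"where {ident(attr)} in ({in_list})\n"
        f"group by {groups}\n"
        f"order by {groups};"
    )


# Same table, attribute, and categories as the query above.
query = build_count_query("demo", ["city", "county"], "use4", ["2056", "2436", "9133"])
print(query)
```

The generated text can be pasted into psql or passed to whatever database driver is in use. Grouping by county as well as city keeps cities that share a name apart.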

You can apply the same approach to your second query:

select zip as "ZIPCODE", 
       city, 
       county, 
       count(*) filter (where use4 = '1466') as use4count1466, 
       count(*) filter (where use4 = '9133') as use4count9133
from demo
where use4 in ('1466','9133')
group by zip, city, county
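
A possible alternative when an attribute has hundreds of categories is to add the attribute itself to the `group by`, which yields one row per location and category instead of one column per category. The sketch below is an assumption-laden illustration, not the answer's method: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the sample rows are made up. The `select` statement itself uses only standard SQL.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("create table demo (zip text, city text, county text, use4 text)")

# Made-up sample rows: two places named Springfield in different counties.
conn.executemany(
    "insert into demo values (?, ?, ?, ?)",
    [
        ("10001", "Springfield", "A", "2056"),
        ("10001", "Springfield", "A", "2056"),
        ("10001", "Springfield", "A", "9133"),
        ("20002", "Springfield", "B", "2056"),
    ],
)

# One row per (zip, city, county, category); no category list is needed.
rows = conn.execute(
    """
    select zip, city, county, use4, count(*) as n
    from demo
    group by zip, city, county, use4
    order by zip, city, county, use4
    """
).fetchall()

for row in rows:
    print(row)
```

The same pattern can be repeated for each of the categorical attributes, and the long result can be pivoted into a wide layout afterwards if the CSV needs one column per category.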