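I built a table in Amazon QuickSight, but the only way I could organize the data in a pivot table the way viewers want was with the SQL snippet below. It works as is, but I'd like to know whether there is a better way to do this, in particular without repeating the same parts in every individual query that gets combined with UNION ALL. I'm actually querying many more "problem" columns, so the real SQL is very long and very repetitive.
The only parts that change between the individual queries are the problem and problem_count expressions: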
'Service' AS problem,
COUNT(CASE WHEN service_yn = 'Yes' THEN 1 END) AS problem_count
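Sorry, I don't know how to format a pivot table properly, but I'll do my best here. Imagine the following 3 tables as one big pivot table.

Base | Count |
---|---|
Accuracy of bill | 110 |
Service | 200 |
Cleanliness | 95 |
Value | 75 |

Premium | Count |
---|---|
Accuracy of bill | 110 |
Service | 200 |
Cleanliness | 95 |
Value | 75 |

Deluxe | Count |
---|---|
Accuracy of bill | 110 |
Service | 200 |
Cleanliness | 95 |
Value | 75 |
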
SELECT CASE
         WHEN descriptor = 'Fast Food' THEN 'Base'
         WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
         WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
         ELSE NULL
       END AS "brand_group",
       'Accuracy of bill' AS problem,
       COUNT(CASE WHEN accuracy_of_bill_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
  AND surveyid NOT IN (SELECT surveyid FROM surveys)
  AND region IN (1,2,3,4,5)
  AND brand_group IS NOT NULL
GROUP BY brand_group
UNION ALL
SELECT CASE
         WHEN descriptor = 'Fast Food' THEN 'Base'
         WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
         WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
         ELSE NULL
       END AS "brand_group",
       'Cleanliness' AS problem,
       COUNT(CASE WHEN cleanliness_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
  AND surveyid NOT IN (SELECT surveyid FROM surveys)
  AND region IN (1,2,3,4,5)
  AND brand_group IS NOT NULL
GROUP BY brand_group
UNION ALL
SELECT CASE
         WHEN descriptor = 'Fast Food' THEN 'Base'
         WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
         WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
         ELSE NULL
       END AS "brand_group",
       'Service' AS problem,
       COUNT(CASE WHEN service_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
  AND surveyid NOT IN (SELECT surveyid FROM surveys)
  AND region IN (1,2,3,4,5)
  AND brand_group IS NOT NULL
GROUP BY brand_group
UNION ALL
SELECT CASE
         WHEN descriptor = 'Fast Food' THEN 'Base'
         WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
         WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
         ELSE NULL
       END AS "brand_group",
       'Value' AS problem,
       COUNT(CASE WHEN value_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
  AND surveyid NOT IN (SELECT surveyid FROM surveys)
  AND region IN (1,2,3,4,5)
  AND brand_group IS NOT NULL
GROUP BY brand_group

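You want the reverse operation of a "pivot", a.k.a. "unpivot". After getting all counts in a single SELECT, this can be done elegantly with a VALUES expression in a CROSS JOIN LATERAL.
A single scan over the table should be much faster than one scan per UNION ALL branch.
While being at it, I also optimized a few other things.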
SELECT brand_group, p.*
FROM (
   SELECT CASE descriptor
             WHEN 'Fast Food' THEN 'Base'
             WHEN 'Sit Down' THEN 'Premium'
             WHEN 'Bar' THEN 'Premium'
             WHEN 'Eatery' THEN 'Premium'
             WHEN 'Full Service' THEN 'Deluxe'
             WHEN 'Boutique' THEN 'Deluxe'
          END AS brand_group
        , count(*) FILTER (WHERE accuracy_of_bill_yn = 'Yes') AS a_ct
        , count(*) FILTER (WHERE cleanliness_yn = 'Yes') AS c_ct
        , count(*) FILTER (WHERE service_yn = 'Yes') AS s_ct
        , count(*) FILTER (WHERE value_yn = 'Yes') AS v_ct
   FROM surveys
   WHERE responsedate >= now() - interval '12 months'
   -- AND surveyid NOT IN (SELECT surveyid FROM surveys) -- ?? nonsense
   AND region IN (1,2,3,4,5)
   AND brand_group IS NOT NULL
   GROUP BY 1
   ) sub
CROSS JOIN LATERAL (
   VALUES
     ('Accuracy of bill', a_ct)
   , ('Cleanliness' , c_ct)
   , ('Service' , s_ct)
   , ('Value' , v_ct)
   ) p(base, count)
ORDER BY 1, 2; -- need that?
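
To see the unpivot step in isolation, here is a minimal sketch that can be run without the surveys table. It assumes PostgreSQL; the inline VALUES list stands in for the aggregated subquery, its numbers are only the illustrative ones from the question's sample tables, and the output column names problem and problem_count are my choice.

```sql
-- Stand-in for the aggregated subquery "sub": one row per brand_group
-- with one count column per problem (illustrative numbers only).
SELECT sub.brand_group, p.problem, p.problem_count
FROM  (
   VALUES
     ('Base'   , 110, 95, 200, 75)
   , ('Premium', 110, 95, 200, 75)
   ) sub(brand_group, a_ct, c_ct, s_ct, v_ct)
CROSS JOIN LATERAL (
   -- Each input row is turned into four output rows, one per count column.
   VALUES
     ('Accuracy of bill', sub.a_ct)
   , ('Cleanliness'     , sub.c_ct)
   , ('Service'         , sub.s_ct)
   , ('Value'           , sub.v_ct)
   ) p(problem, problem_count)
ORDER BY 1, 2;
```

Two input rows produce eight output rows here. Adding another problem column means one more aggregate in the subquery and one more row in the VALUES list, not another full SELECT.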
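The following filter is a logical contradiction and can never return any rows. It must be a typo in the question, so I commented it out. Replace it with the condition you actually intended: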
AND surveyid NOT IN (SELECT surveyid FROM surveys)
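
A quick way to convince yourself, as a sketch assuming PostgreSQL: the CTE below is a three-row stand-in that shadows the real surveys table for the duration of the query.

```sql
-- Every surveyid is, by definition, contained in the set of all surveyids
-- of the same table, so NOT IN can never be true for any row.
WITH surveys(surveyid) AS (VALUES (1), (2), (3))
SELECT count(*) AS remaining_rows   -- 0
FROM   surveys
WHERE  surveyid NOT IN (SELECT surveyid FROM surveys);
```

The subquery presumably was meant to read from a different table or to carry its own WHERE clause. Only the author of the question can tell which.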
All your *_yn columns should be boolean, not text or anything else.
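
A possible conversion for one column, as a sketch only. It assumes PostgreSQL, that the column currently holds nothing but 'Yes', 'No' or NULL, and that no default, index or view depends on the text type:

```sql
-- 'Yes' becomes true, any other non-null value becomes false, NULL stays NULL.
ALTER TABLE surveys
   ALTER COLUMN service_yn TYPE boolean
   USING (service_yn = 'Yes');

-- Afterwards the filter needs no string comparison:
SELECT count(*) FILTER (WHERE service_yn) AS s_ct
FROM   surveys;
```

The same statement would have to be repeated for each of the other *_yn columns.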
I use the aggregate FILTER clause instead of COUNT(CASE ...).
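
Both forms count the same rows. A small self-contained comparison, assuming PostgreSQL 9.4 or later (where FILTER is available) and using made-up inline data:

```sql
-- Four sample answers: two 'Yes', one 'No', one NULL.
SELECT count(CASE WHEN service_yn = 'Yes' THEN 1 END) AS case_count    -- 2
     , count(*) FILTER (WHERE service_yn = 'Yes')     AS filter_count  -- 2
FROM  (VALUES ('Yes'), ('No'), ('Yes'), (NULL)) t(service_yn);
```

The CASE variant works because count() ignores the NULL that CASE returns for non-matching rows. FILTER states the same intent directly.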
I also use a "switched" CASE (the simple form, CASE descriptor WHEN ...). It should be a bit cheaper.
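
The two CASE forms map every descriptor to the same group. A sketch with inline sample values, where 'Food Truck' is a made-up descriptor that matches no branch:

```sql
SELECT descriptor
       -- "switched" (simple) CASE: the operand is written once
     , CASE descriptor
          WHEN 'Fast Food'    THEN 'Base'
          WHEN 'Sit Down'     THEN 'Premium'
          WHEN 'Bar'          THEN 'Premium'
          WHEN 'Eatery'       THEN 'Premium'
          WHEN 'Full Service' THEN 'Deluxe'
          WHEN 'Boutique'     THEN 'Deluxe'
       END AS simple_case
       -- searched CASE, as written in the question
     , CASE
          WHEN descriptor = 'Fast Food' THEN 'Base'
          WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
          WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
       END AS searched_case
FROM  (VALUES ('Fast Food'), ('Bar'), ('Boutique'), ('Food Truck')) v(descriptor);
```

Without an ELSE branch, both expressions return NULL for the unmatched descriptor, so the explicit ELSE NULL in the question is not needed.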
Typically, one would just want the more compact result from the base query (possibly with more descriptive column names):
SELECT CASE descriptor
          WHEN 'Fast Food' THEN 'Base'
          WHEN 'Sit Down' THEN 'Premium'
          WHEN 'Bar' THEN 'Premium'
          WHEN 'Eatery' THEN 'Premium'
          WHEN 'Full Service' THEN 'Deluxe'
          WHEN 'Boutique' THEN 'Deluxe'
       END AS brand_group
     , count(*) FILTER (WHERE accuracy_of_bill_yn = 'Yes') AS a_ct
     , count(*) FILTER (WHERE cleanliness_yn = 'Yes') AS c_ct
     , count(*) FILTER (WHERE service_yn = 'Yes') AS s_ct
     , count(*) FILTER (WHERE value_yn = 'Yes') AS v_ct
FROM surveys
-- WHERE ...
GROUP BY 1;