Why does this crosstab() query return duplicate keys?


I have a table named sample_events:

 Column | Type
--------+-----
 title  | text
 date   | date

Values:

 title |     date
-------+------------
 ev1   | 2017-01-01
 ev2   | 2017-01-03
 ev3   | 2017-01-02
 ev4   | 2017-12-10
 ev5   | 2017-12-11
 ev6   | 2017-07-28

To create a pivot table with the number of events per month for each distinct year, I used the crosstab function in the form crosstab(text source_sql, text category_sql):

SELECT * FROM crosstab (
   'SELECT extract(year from date) AS year,
        extract(month from date) AS month, count(*)
    FROM sample_events
    GROUP BY year, month'
,
   'SELECT * FROM generate_series(1, 12)'
) AS (
    year int, jan int, feb int, mar int,
    apr int, may int, jun int, jul int,
    aug int, sep int, oct int, nov int, dec int
) ORDER BY year;

The result is as expected:

 year | jan | feb | mar | apr | may | jun | jul | aug | sep | oct | nov | dec
------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+----
 2017 |   3 |     |     |     |     |     |   1 |     |     |     |     |   2

Now I want to create a pivot table with the number of events per day of the week for each distinct week of the year. I tried the following query:

SELECT * FROM crosstab (
   'SELECT extract(week from date) AS week,
        extract(dow from date) AS day_of_week, count(*)
    FROM sample_events
    GROUP BY week, day_of_week'
,
   'SELECT * FROM generate_series(0, 6)'
) AS (
    week int, sun int, mon int, tue int,
    wed int, thu int, fri int, sat int
) ORDER BY week;

The result is not what I expected:

 week | sun | mon | tue | wed | thu | fri | sat 
------+-----+-----+-----+-----+-----+-----+-----
    1 |     |     |   1 |     |     |     |    
    1 |     |   1 |     |     |     |     |    
   30 |     |     |     |     |     |   1 |    
   49 |   1 |     |     |     |     |     |    
   50 |     |   1 |     |     |     |     |    
   52 |   1 |     |     |     |     |     |    

All six events are there, but for some reason there are duplicate week values. I expected the result to look like this:

 week | sun | mon | tue | wed | thu | fri | sat 
------+-----+-----+-----+-----+-----+-----+-----
    1 |     |   1 |   1 |     |     |     |    
   30 |     |     |     |     |     |   1 |    
   49 |   1 |     |     |     |     |     |    
   50 |     |   1 |     |     |     |     |    
   52 |   1 |     |     |     |     |     |    

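Questions

1) Why does the result of the latter query contain duplicate key values while the result of the former does not?

2) How can I create a pivot table with unique week values?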

postgresql date crosstab
1 answer (6 votes)

crosstab() requires ordered input. You need to add ORDER BY to the input query:

SELECT * FROM crosstab (
   'SELECT extract(week from date)::int AS week
         , extract(dow  from date)::int AS day_of_week
         , count(*)::int
    FROM   sample_events
    GROUP  BY week, day_of_week
    ORDER  BY week, day_of_week'
 , 'SELECT generate_series(0, 6)'
   ) AS (
    week int, sun int, mon int, tue int,
    wed int, thu int, fri int, sat int
);

Or just:

ORDER  BY week

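Strictly speaking, rows with the same key (week in the example) only need to be grouped, that is, arrive in sequence. The keys themselves do not have to be sorted. But the simplest and cheapest way to achieve this is ORDER BY, which also sorts the keys.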
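To illustrate why unordered input produces duplicate keys, here is a minimal Python sketch of the grouping rule described above. It is a simplified model, not the actual tablefunc implementation: the function name crosstab_like and the arrival order of the rows in unordered are assumptions made up for this illustration.

```python
def crosstab_like(rows, categories):
    """Simplified model of crosstab(source_sql, category_sql):
    a new output row starts whenever the key differs from the previous row's key."""
    col = {cat: i for i, cat in enumerate(categories)}
    out = []
    prev_key = None
    for key, cat, value in rows:
        if not out or key != prev_key:
            # Key changed (or first row): open a new output row of NULLs.
            out.append([key] + [None] * len(col))
            prev_key = key
        if cat in col:
            out[-1][1 + col[cat]] = value
    return out


# (week, day_of_week, count) rows in a hypothetical arrival order
# where the two rows for week 1 are not adjacent.
unordered = [(1, 2, 1), (30, 5, 1), (1, 1, 1), (49, 0, 1), (50, 1, 1), (52, 0, 1)]

dup = crosstab_like(unordered, range(7))         # week 1 appears twice
ok = crosstab_like(sorted(unordered), range(7))  # like ORDER BY 1, 2

print(len(dup), len(ok))  # 6 5
print(ok[0])              # [1, None, 1, 1, None, None, None, None]
```

Sorting the input plays the role of ORDER BY in the source query: both rows for week 1 become adjacent and end up in a single output row.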

Or, shorter:

SELECT * FROM crosstab (
   'SELECT extract(week from date)::int
         , extract(dow  from date)::int
         , count(*)::int
    FROM   sample_events
    GROUP  BY 1, 2
    ORDER  BY 1, 2'  -- or just ORDER BY 1
 ,  'SELECT generate_series(0, 6)'
) AS ...

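Your first example with months happens to work because the input rows happen to arrive in a suitable sequence (the sample data contains only one year, so all rows share the same key). In general, this can break at any time when the physical order of rows in the table changes (VACUUM, UPDATE, ...). You can never rely on the physical order of rows in a relational table.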
