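I have a table with one or more rows for each value in an ID column. I want to pivot this table and aggregate the resulting values of the value column by concatenating them, using SQL.
Below is an example of what I want, implemented in R with the tidyverse. Each person can have a specific type of relationship with one or more other people: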
library(tidyverse)
df <- data.frame(
  person = c("Tina", "Tina", "Tina", "Rachel", "Rachel", "Rachel"),
  relationship_with = c("George", "Jamal", "Thomas", "Joe", "Taylor", "Bruce"),
  relationship_type = c("friend", "friend", "coworker", "friend", "coworker", "coworker")
)
df |> pivot_wider(names_from = relationship_type,
                  values_from = relationship_with,
                  values_fn = ~paste(., collapse = ","))
# A tibble: 2 × 3
  person friend       coworker
  <chr>  <chr>        <chr>
1 Tina   George,Jamal Thomas
2 Rachel Joe          Taylor,Bruce
In Teradata Studio, I tried to do the same thing in SQL like this:
WITH cte AS (
    SELECT person, relationship_with, relationship_type
    FROM df
)
SELECT *
FROM cte
PIVOT (
    CONCAT(relationship_with) FOR relationship_type IN (
        'friend',
        'coworker'
    )
) AS pivot;
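Unfortunately, the code above raises a syntax error. It seems that PIVOT requires an SQL aggregate function such as AVG or MAX, and CONCAT is not allowed inside PIVOT. How can I achieve what I want in SQL?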
I found a way: first aggregate with xmlagg(), then apply pivot():
-- Preselect your columns here
with cte as (
    select
        person,
        relationship_with,
        relationship_type
    from df
),
-- Create another CTE that performs the aggregation using XMLAGG
aggr as (
    select
        person,
        relationship_type,
        trim(trailing ',' from (
            xmlagg(trim(relationship_with) || ',' order by person)
        )) as relationship_with
    from cte
    group by person, relationship_type
)
-- Pivot the aggregated table
-- Using max() here, but this is just because pivot() requires an aggregation function
select
    *
from aggr
pivot (
    max(relationship_with) for relationship_type in (
        'friend', 'coworker'
    )
) pivo;
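To make the two-step idea easier to follow, here is a minimal sketch of the same logic in plain Python. It is only an illustration, not Teradata code: the `rows` list and the `pivot_concat` function are hypothetical names I made up, and the data is the sample from the question. Step 1 concatenates the values per (person, relationship_type), like the XMLAGG CTE; step 2 spreads the types into columns, like PIVOT.

```python
from collections import defaultdict

# Sample data mirroring the df table from the question (hypothetical name).
rows = [
    ("Tina", "George", "friend"),
    ("Tina", "Jamal", "friend"),
    ("Tina", "Thomas", "coworker"),
    ("Rachel", "Joe", "friend"),
    ("Rachel", "Taylor", "coworker"),
    ("Rachel", "Bruce", "coworker"),
]


def pivot_concat(rows, sep=","):
    """Aggregate by concatenation first, then pivot (illustrative helper)."""
    # Step 1: collect the values for each (person, relationship_type) pair
    # and join them, which is what the GROUP BY + XMLAGG step does.
    grouped = defaultdict(list)
    for person, relationship_with, relationship_type in rows:
        grouped[(person, relationship_type)].append(relationship_with)
    aggregated = {key: sep.join(values) for key, values in grouped.items()}

    # Step 2: pivot. Each (person, relationship_type) pair now holds exactly
    # one string, so any aggregate (MAX in the SQL) just returns that string.
    wide = defaultdict(dict)
    for (person, relationship_type), joined in aggregated.items():
        wide[person][relationship_type] = joined
    return dict(wide)


result = pivot_concat(rows)
for person, columns in result.items():
    print(person, columns.get("friend"), columns.get("coworker"))
```

This also shows why max() is harmless in the pivot: after the aggregation step there is only one value left per person and relationship type, so the aggregate has nothing to choose between.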