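I'm just starting to explore the world of data analysis.
I have a dataset of 43 tables, and all of them share the same primary key. How do I create 1 table out of the 43 I have?
Is a join the only way to do this? Or UNION ALL across all the tables? Are there any tricks that make it easier and newbie friendly :D. I'm using BigQuery, thanks!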
SELECT
  AVG(duration)
FROM (
  SELECT
    day_of_week,
    duration,
    member_casual
  FROM (
    SELECT
      started_at,
      ended_at,
      member_casual,
      DATETIME_DIFF(ended_at, started_at, MINUTE) AS duration,
      EXTRACT(DAYOFWEEK FROM started_at) AS day_of_week
    FROM `cyclistic-406707.Trips_2020_to_2023.Trips_2023_10`
  )
  WHERE duration >= 0.1
    AND duration <= 1440
)
WHERE member_casual = "casual"
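Why not just use a union to put the tables together? It works for tables that have the same number of columns and compatible data types (in BigQuery you have to write UNION ALL or UNION DISTINCT; a bare UNION is rejected):
SELECT_statement UNION ALL SELECT_statement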
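Or, if the tables have different structures, you can union just the fields they have in common:
SELECT name
FROM customers_1
UNION DISTINCT  -- plain UNION in other SQL dialects; drops duplicate rows
SELECT name
FROM customers_2;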
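To see the difference between the two union flavours without touching real tables, here is a small sketch that builds two throwaway inputs inline. The names and values in `customers_1` and `customers_2` are made up for illustration only.

```sql
-- Hypothetical inline data, just to compare UNION DISTINCT and UNION ALL.
WITH customers_1 AS (
  SELECT 'Ann' AS name UNION ALL
  SELECT 'Bob'
),
customers_2 AS (
  SELECT 'Bob' AS name UNION ALL
  SELECT 'Cy'
)
SELECT name FROM customers_1
UNION DISTINCT          -- 'Bob' appears once in the result
SELECT name FROM customers_2;
```

Swapping UNION DISTINCT for UNION ALL keeps both 'Bob' rows. For stacking monthly trip tables you almost certainly want UNION ALL, since removing duplicates would cost extra work and could silently drop trips that happen to look identical.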
What you want is a view, for example:
CREATE VIEW `cyclistic-406707.Trips_2020_to_2023.AllTrips`
AS
SELECT *
FROM `cyclistic-406707.Trips_2020_to_2023.Trips_2020_01`
UNION ALL
SELECT *
FROM `cyclistic-406707.Trips_2020_to_2023.Trips_2020_02`
UNION ALL
...
UNION ALL
SELECT *
FROM `cyclistic-406707.Trips_2020_to_2023.Trips_2023_12`;
After that you can query the view directly:
SELECT *
FROM `cyclistic-406707.Trips_2020_to_2023.AllTrips`
WHERE started_at BETWEEN ...
AND member_casual = "casual";
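If typing out all 43 branches feels tedious, BigQuery's wildcard tables may be an easier route. The sketch below assumes that every table in the dataset whose name starts with `Trips_` is one of the monthly tables and that their schemas are compatible; `year_month` and `trips` are just illustrative aliases.

```sql
-- Sketch: read every table matching Trips_* in one go.
-- _TABLE_SUFFIX holds the part of the table name matched by the *,
-- e.g. '2023_10' for Trips_2023_10.
SELECT
  _TABLE_SUFFIX AS year_month,
  COUNT(*) AS trips
FROM `cyclistic-406707.Trips_2020_to_2023.Trips_*`
WHERE _TABLE_SUFFIX BETWEEN '2023_01' AND '2023_12'  -- only the 2023 tables
GROUP BY year_month
ORDER BY year_month;
```

A quick count per month like this is also a handy sanity check that all the tables you expect are actually being picked up. As far as I know the wildcard only matches real tables, not views, so keep that in mind if the dataset also contains views with a similar name.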
On top of that, you can add columns in the view:
CREATE VIEW `cyclistic-406707.Trips_2020_to_2023.AllTrips`
AS
SELECT *, DATETIME_DIFF(ended_at, started_at, MINUTE) AS duration
FROM `cyclistic-406707.Trips_2020_to_2023.Trips_2020_01`
...
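With a view like that in place, the average from the question could shrink to something like the sketch below. It assumes `AllTrips` was created with the `duration` column in every UNION ALL branch; `avg_duration_minutes` is just an illustrative alias.

```sql
-- Sketch: average ride length per rider type across all months,
-- assuming the AllTrips view exposes a duration column in minutes.
SELECT
  member_casual,
  AVG(duration) AS avg_duration_minutes
FROM `cyclistic-406707.Trips_2020_to_2023.AllTrips`
WHERE duration BETWEEN 1 AND 1440   -- same bounds as the original filter
GROUP BY member_casual;
```

The lower bound is 1 rather than 0.1 because DATETIME_DIFF with MINUTE returns whole minutes, so `duration >= 0.1` and `duration >= 1` keep the same rows. Grouping by `member_casual` gives you the casual and member averages side by side instead of running one query per rider type.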