I'm building a query in Impala SQL that needs to transform the data before aggregating it. Here is my query:
With concatenated_addresses As (
Select site_name, concat(parent_address_line_1, coalesce(parent_address_line_2," "), coalesce(parent_address_line_3," "), coalesce(parent_address_line_4," ")) as concated_address
From locations_all_vw
)
Select l.site_name, min(l.parent_address_region) as region, group_concat(distinct c.concated_address, " | ") as address_line_1,
min(l.parent_city) as city, min(l.parent_cntry_code) as city_code, min(l.parent_county) as country, min(l.parent_state_province) as state_province, min(parent_state_province_code) as province_code, min(parent_location_status) as status,
min(l.parent_location_sub_type) as location_subtype, min(l.parent_location_type) as location_type, min(l.parent_longitude) as longitude, min(l.parent_latitude) as latitude,
min(l.parent_postal_code) as postal_code, min(l.parent_postal_code_ext) as postal_code_ext, group_concat(distinct l.source_system_code, ", ") as source_system, group_concat( distinct l.business_group_description, ", ") as business_group
from locations_all_vw l
INNER JOIN concatenated_addresses c
ON l.site_name = c.site_name
GROUP BY l.site_name
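The query first concatenates the address fields into a single value in a CTE, joins that CTE back to the actual table, and then groups everything. I do this to build one address out of several columns and then collect the distinct values of those concatenated addresses.
The query works, but it is quite slow (the table has over 100,000 rows). I'm not an SQL expert, so I'd like to know whether there is a more efficient way to get what I need.
Thanks!
I wrote the query and it works, but I'd like a better-performing version.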
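You can do this in two ways, neither of which needs the self-join: either build the concatenated address inline inside group_concat, or build it in a CTE and aggregate over that CTE directly.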
Select l.site_name, min(l.parent_address_region) as region, group_concat(distinct concat(parent_address_line_1, coalesce(parent_address_line_2," "), coalesce(parent_address_line_3," "), coalesce(parent_address_line_4," ")), " | ") as address_line_1,
min(l.parent_city) as city, min(l.parent_cntry_code) as city_code, min(l.parent_county) as country, min(l.parent_state_province) as state_province, min(parent_state_province_code) as province_code, min(parent_location_status) as status,
min(l.parent_location_sub_type) as location_subtype, min(l.parent_location_type) as location_type, min(l.parent_longitude) as longitude, min(l.parent_latitude) as latitude,
min(l.parent_postal_code) as postal_code, min(l.parent_postal_code_ext) as postal_code_ext, group_concat(distinct l.source_system_code, ", ") as source_system, group_concat( distinct l.business_group_description, ", ") as business_group
from locations_all_vw l
GROUP BY l.site_name
Or, with the concatenation in a CTE:
With loc_cte As (
Select l.*, concat(parent_address_line_1, coalesce(parent_address_line_2," "), coalesce(parent_address_line_3," "), coalesce(parent_address_line_4," ")) as concated_address
From locations_all_vw l
)
Select l.site_name, min(l.parent_address_region) as region, group_concat(distinct concated_address, " | ") as address_line_1,
min(l.parent_city) as city, min(l.parent_cntry_code) as city_code, min(l.parent_county) as country, min(l.parent_state_province) as state_province, min(parent_state_province_code) as province_code, min(parent_location_status) as status,
min(l.parent_location_sub_type) as location_subtype, min(l.parent_location_type) as location_type, min(l.parent_longitude) as longitude, min(l.parent_latitude) as latitude,
min(l.parent_postal_code) as postal_code, min(l.parent_postal_code_ext) as postal_code_ext, group_concat(distinct l.source_system_code, ", ") as source_system, group_concat( distinct l.business_group_description, ", ") as business_group
from loc_cte l
GROUP BY l.site_name
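To see why dropping the join is likely to help, here is a rough sketch of how many rows the original self-join produces before GROUP BY runs. It is only an illustration: it uses Python's built-in sqlite3 as a stand-in for Impala, and the five sample rows and the reduced two-column table are made up. The last query (sum of squared per-site row counts) is plain SQL that should also run against the real view if you want to estimate the blow-up there.

```python
import sqlite3

# Toy stand-in for locations_all_vw (assumption: in-memory SQLite, not Impala,
# and only two of the real columns, with invented sample rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locations_all_vw (site_name TEXT, parent_address_line_1 TEXT)"
)
rows = [
    ("site_a", "1 Main St"),
    ("site_a", "1 Main St"),
    ("site_a", "2 Side Rd"),
    ("site_b", "9 High St"),
    ("site_b", "9 High St"),
]
conn.executemany("INSERT INTO locations_all_vw VALUES (?, ?)", rows)

# Rows the aggregation has to process when there is no self-join.
base_rows = conn.execute("SELECT count(*) FROM locations_all_vw").fetchone()[0]

# Rows produced by joining the view to itself on site_name, before GROUP BY:
# every row of a site is paired with every other row of the same site.
joined_rows = conn.execute(
    "SELECT count(*) FROM locations_all_vw l "
    "INNER JOIN locations_all_vw c ON l.site_name = c.site_name"
).fetchone()[0]

# The same figure without running the join: sum of squared per-site counts.
predicted_rows = conn.execute(
    "SELECT sum(n * n) FROM ("
    "  SELECT site_name, count(*) AS n FROM locations_all_vw GROUP BY site_name"
    ") t"
).fetchone()[0]

print(base_rows, joined_rows, predicted_rows)
```

If a site has many rows in the view, the joined row count grows with the square of that number, which is work both rewritten queries above avoid.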