How do I join multiple tables and split a metric across the output rows?

I am using Google BigQuery to join multiple tables. The three tables I want to join look like this:

Table A

brand    searchterm
ford     ford
ford     foord
peugeot  peugeot

Table B

label    searchterm
brand    brand
generic  generic

Table C

date        campaign_name           clicks
01-01-2024  brand - ford            30
01-01-2024  brand - ford - peugeot  50

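The goal of the query is to add the brand and label information from Table A and Table B to Table C. The join should be made on the campaign_name and searchterm columns: when a search term appears in campaign_name, the corresponding brand or label is added to the output. This is the query I have so far: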

WITH brands AS (
  SELECT 
  brand,
  CONCAT('%',searchterm,'%') AS searchterm
  FROM TableA
),

labels AS (
  SELECT 
  label,
  CONCAT('%',searchterm,'%') AS searchterm
  FROM TableB
),

FINAL AS (
  SELECT 
  date,
  campaign_name,
  CASE WHEN LOWER(campaign_name) LIKE brands.searchterm THEN brands.brand ELSE 'generic' END AS brand,
  CASE WHEN LOWER(campaign_name) LIKE labels.searchterm THEN labels.label ELSE 'no label' END AS label, 
  SUM(clicks) AS clicks,
  FROM TableC
  LEFT JOIN brands
  ON LOWER(campaign_name) LIKE brands.searchterm 
  LEFT JOIN labels
  ON LOWER(campaign_name) LIKE labels.searchterm
  WHERE 1=1 
  GROUP BY date, campaign_name, brand, label)

SELECT * FROM FINAL

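This query gives the correct output when there is exactly one match between campaign_name and searchterm. However, when there are multiple matches, as in the second row of Table C, the output should be different: the clicks value should be divided by the number of matches and spread across the resulting rows. The desired output from combining tables A, B, and C looks like this: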

date        campaign_name           brand    label  clicks
01-01-2024  brand - ford            ford     brand  30
01-01-2024  brand - ford - peugeot  ford     brand  25
01-01-2024  brand - ford - peugeot  peugeot  brand  25

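The clicks value has been divided by 2 and attributed to the two brands, ford and peugeot. How can I get this output?

I have tried several approaches; the current state of the query is shown above. I tried dividing the clicks metric by some value, but I can't work out which value to divide by.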

sql join google-bigquery

1 answer

Can you try this?

with TableA as (
          select "ford" as brand,"ford" as searchterm
union all select "ford" as brand, "foord" as searchterm
union all select "peugeot" as brand, "peugeot" as searchterm
union all select "peugeot" as brand, "peugeoot" as searchterm
),

TableB as (
          select "brand" as label,"brand" as searchterm
union all select "generic" as label, "generic" as searchterm
),

TableC as(
          select '01-01-2024' as date,"brand - ford" as campaign_name, 30 as clicks
union all select '01-01-2024' as date,"brand - ford - peugeot" as campaign_name, 50 as clicks
union all select '01-01-2024' as date,"brand - ford - peugeot - foord" as campaign_name, 30 as clicks
),

brands AS (
  SELECT 
  brand,
  CONCAT('%',searchterm,'%') AS searchterm
  FROM TableA
),

labels AS (
  SELECT 
  label,
  CONCAT('%',searchterm,'%') AS searchterm
  FROM TableB
),


FINAL AS (
  SELECT 
  date,
  campaign_name,
  CASE WHEN LOWER(campaign_name) LIKE brands.searchterm THEN brands.brand ELSE 'generic' END AS brand,
  CASE WHEN LOWER(campaign_name) LIKE labels.searchterm THEN labels.label ELSE 'no label' END AS label, 
  clicks AS clicks,
  -- Number of output rows per date and campaign, used as the divisor for clicks
  COUNT(*) OVER (PARTITION BY date, campaign_name) as match_count,
  FROM TableC
  LEFT JOIN brands
  ON LOWER(campaign_name) LIKE brands.searchterm 
  LEFT JOIN labels
  ON LOWER(campaign_name) LIKE labels.searchterm
  WHERE 1=1 
  GROUP BY date, campaign_name, brand, label,clicks)


SELECT * EXCEPT (clicks,match_count), clicks/match_count as newClicks FROM FINAL

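I removed the SUM over clicks, because it was adding the total clicks into the table, and I use match_count as the number to divide the clicks by across the joined rows.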
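To check the splitting logic without a BigQuery project, here is a minimal sketch that runs the same idea in SQLite through Python's standard sqlite3 module. It is not the query from the answer: the function name split_clicks, the tables table_a and table_c, and the CTE matched are hypothetical names chosen for illustration, the label join is left out for brevity, and the SQL uses SQLite syntax (`||` instead of CONCAT). It assumes one row per date and campaign_name in Table C and needs SQLite 3.25 or later for window functions.

```python
import sqlite3

# Sample rows mirroring the tables in the question (search terms are lowercase).
BRAND_TERMS = [("ford", "ford"), ("ford", "foord"), ("peugeot", "peugeot")]
CAMPAIGNS = [
    ("01-01-2024", "brand - ford", 30),
    ("01-01-2024", "brand - ford - peugeot", 50),
]


def split_clicks(brand_terms, campaigns):
    """Split each campaign's clicks evenly across the distinct brands it matches."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table_a (brand TEXT, searchterm TEXT)")
    con.execute(
        "CREATE TABLE table_c (date TEXT, campaign_name TEXT, clicks INTEGER)"
    )
    con.executemany("INSERT INTO table_a VALUES (?, ?)", brand_terms)
    con.executemany("INSERT INTO table_c VALUES (?, ?, ?)", campaigns)
    rows = con.execute(
        """
        WITH matched AS (
          -- DISTINCT collapses several search terms of one brand
          -- (for example 'ford' and 'foord') into a single match.
          -- Campaigns without any match keep one row labelled 'generic'.
          SELECT DISTINCT c.date, c.campaign_name, c.clicks,
                 COALESCE(a.brand, 'generic') AS brand
          FROM table_c AS c
          LEFT JOIN table_a AS a
            ON LOWER(c.campaign_name) LIKE '%' || a.searchterm || '%'
        )
        SELECT date, campaign_name, brand,
               -- Divide by the number of matched brands for this campaign and date.
               1.0 * clicks / COUNT(*) OVER (PARTITION BY date, campaign_name)
                 AS clicks
        FROM matched
        ORDER BY date, campaign_name, brand
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in split_clicks(BRAND_TERMS, CAMPAIGNS):
        print(row)
```

With the sample data this yields 30 clicks for "brand - ford" and 25 clicks each for ford and peugeot on "brand - ford - peugeot", which matches the desired output in the question. Counting distinct brands before dividing is a design choice of this sketch: if labels are joined in as well, every brand and label combination becomes a row, so the divisor should be computed on whichever set of rows the clicks are meant to be spread over.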
