SQL to list other products bought and count buyers, by product bought

Question (votes: 1, answers: 2)

After years of reading answers, I finally have occasion to ask a question.

I have a list of purchased products and unique customer IDs:

+---------+--------+
| Product | Buyer  |
+---------+--------+
| Apples  | Rod    |
| Apples  | Jane   |
| Apples  | Freddy |
| Bananas | Rod    |
| Bananas | Jane   |
| Bananas | Freddy |
| Bananas | Zippy  |
| Pears   | Rod    |
| Pears   | Zippy  |
+---------+--------+

I would like to produce the following output in Netezza SQL:

+-----------+-------------+------------------------+---------------------+
| Product A | Buyers of A | A Buyers Also Bought B | No of A Buyers of B |
+-----------+-------------+------------------------+---------------------+
| Apples    |           3 | Bananas                |                   3 |
| Apples    |           3 | Pears                  |                   1 |
| Bananas   |           4 | Apples                 |                   3 |
| Bananas   |           4 | Pears                  |                   2 |
| Pears     |           2 | Apples                 |                   1 |
| Pears     |           2 | Bananas                |                   2 |
+-----------+-------------+------------------------+---------------------+

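...so that I can see, for each product, the total number of buyers. Crucially, I also want to see, for each product, how many of those buyers also bought each of the other products in the same list. Edit: it is important to reiterate that a buyer should not be counted in column B unless they also bought product A.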

What is the most efficient way to do this?

(I will then calculate the buyers of B as a percentage of the buyers of A, but that part is easy.)

Thanks!

sql netezza market-basket-analysis
2 Answers
0
votes

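You can create a summary of counts and then cross join it with itself, excluding the rows where a product is matched with itself.

Something like this: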

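SELECT 
    A.Product,
    A.Buyers,
    B.Product,
    B.Buyers   -- total buyers of B, not only those who also bought A
FROM (
    -- one row per product with its buyer count
    SELECT
        Product,
        count(*) AS Buyers
    FROM
        ProductBuyers
    GROUP BY
        Product
) AS A
CROSS JOIN (
    SELECT
        Product,
        count(*) AS Buyers
    FROM
        ProductBuyers
    GROUP BY
        Product
) AS B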
WHERE 
    A.Product != B.Product
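
As a rough check of what the summary-and-cross-join idea returns, the sketch below loads the question's nine sample rows into an in-memory SQLite database from Python and runs a runnable form of the query. SQLite is only a stand-in for Netezza here, and the table name `ProductBuyers` with columns `Product` and `Buyer` is assumed from the answer and the question's table. On this data the last column is each product's overall buyer count, so Apples paired with Pears reports 2, whereas the output requested in the question shows 1 for that pair.

```python
import sqlite3

# Sample rows from the question (Product, Buyer).
ROWS = [
    ("Apples", "Rod"), ("Apples", "Jane"), ("Apples", "Freddy"),
    ("Bananas", "Rod"), ("Bananas", "Jane"), ("Bananas", "Freddy"),
    ("Bananas", "Zippy"),
    ("Pears", "Rod"), ("Pears", "Zippy"),
]

# Count summary cross joined with itself, skipping same-product pairs.
CROSS_JOIN = """
SELECT A.Product, A.Buyers, B.Product, B.Buyers
FROM (SELECT Product, count(*) AS Buyers
      FROM ProductBuyers GROUP BY Product) AS A
CROSS JOIN
     (SELECT Product, count(*) AS Buyers
      FROM ProductBuyers GROUP BY Product) AS B
WHERE A.Product != B.Product
ORDER BY A.Product, B.Product
"""

def run_cross_join():
    conn = sqlite3.connect(":memory:")  # SQLite as a stand-in, not Netezza
    conn.execute("CREATE TABLE ProductBuyers (Product TEXT, Buyer TEXT)")
    conn.executemany("INSERT INTO ProductBuyers VALUES (?, ?)", ROWS)
    result = conn.execute(CROSS_JOIN).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in run_cross_join():
        print(row)
```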

0
votes

The basic data on purchases in common comes from a self-join and a group by:

select p1.product, p2.product, count(*) as in_common
from purchases p1 join
     purchases p2
     on p1.buyer = p2.buyer
group by p1.product, p2.product;
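
To see what this self-join produces on the question's sample data, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite stands in for Netezza purely for illustration, and the `purchases` table with `product` and `buyer` columns is assumed from the query. Each product is also paired with itself, so rows such as (Apples, Apples, 3) appear; adding `and p1.product <> p2.product` to the join condition is one way to drop them.

```python
import sqlite3

# Sample rows from the question (product, buyer).
ROWS = [
    ("Apples", "Rod"), ("Apples", "Jane"), ("Apples", "Freddy"),
    ("Bananas", "Rod"), ("Bananas", "Jane"), ("Bananas", "Freddy"),
    ("Bananas", "Zippy"),
    ("Pears", "Rod"), ("Pears", "Zippy"),
]

# Pair every purchase with every other purchase by the same buyer,
# then count the pairs per (product, product) combination.
SELF_JOIN = """
select p1.product, p2.product, count(*) as in_common
from purchases p1 join
     purchases p2
     on p1.buyer = p2.buyer
group by p1.product, p2.product
order by p1.product, p2.product
"""

def run_self_join():
    conn = sqlite3.connect(":memory:")  # SQLite as a stand-in, not Netezza
    conn.execute("create table purchases (product text, buyer text)")
    conn.executemany("insert into purchases values (?, ?)", ROWS)
    result = conn.execute(SELF_JOIN).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in run_self_join():
        print(row)
```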

To get the count for one (or the other) product, add a join:

select p1.product, p2.product, pp.cnt, count(*) as in_common
from purchases p1 join
     purchases p2
     on p1.buyer = p2.buyer join
     -- total number of buyers per product
     (select product, count(*) as cnt
      from purchases
      group by product
     ) pp
     on pp.product = p1.product
group by p1.product, p2.product, pp.cnt;

Alternatively, you can use window functions:

select p1.product, p1.cnt, p2.product, count(*) as in_common
from (select p1.*,
             count(*) over (partition by p1.product) as cnt
      from purchases p1
     ) p1 join
     purchases p2
     on p1.buyer = p2.buyer
group by p1.product, p2.product, p1.cnt;
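
The sketch below checks a variant of the window-function query against the question's sample data, again using an in-memory SQLite database from Python as a stand-in for Netezza (this assumes an SQLite build with window-function support, 3.25 or later). Two things are additions for illustration rather than part of the query above: the `p1.product <> p2.product` condition, which removes same-product pairs so the rows line up with the requested output, and the `pct_of_a` column, which is one possible way to express the percentage mentioned in the question. The column aliases are likewise hypothetical.

```python
import sqlite3

# Sample rows from the question (product, buyer).
ROWS = [
    ("Apples", "Rod"), ("Apples", "Jane"), ("Apples", "Freddy"),
    ("Bananas", "Rod"), ("Bananas", "Jane"), ("Bananas", "Freddy"),
    ("Bananas", "Zippy"),
    ("Pears", "Rod"), ("Pears", "Zippy"),
]

# cnt = buyers of product A (window count); count(*) = A buyers who bought B.
QUERY = """
select p1.product as product_a, p1.cnt as buyers_of_a,
       p2.product as product_b, count(*) as a_buyers_of_b,
       100.0 * count(*) / p1.cnt as pct_of_a
from (select p.*,
             count(*) over (partition by p.product) as cnt
      from purchases p
     ) p1 join
     purchases p2
     on p1.buyer = p2.buyer and p1.product <> p2.product
group by p1.product, p2.product, p1.cnt
order by p1.product, p2.product
"""

def run_query():
    conn = sqlite3.connect(":memory:")  # SQLite as a stand-in, not Netezza
    conn.execute("create table purchases (product text, buyer text)")
    conn.executemany("insert into purchases values (?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in run_query():
        print(row)
```

The first four columns correspond to the four columns of the output table in the question.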

Here is a rextester showing it working.
