Dataframe: Cartesian product across columns

Question (votes: 0, answers: 1)

I have the following dataframe:

0   ['1','2','3']    ['5','6','2']    ['2','5']
1   ['2','3']        ['2','3']        ['1']

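I want to find the Cartesian product of col1, col2, and col3. Can anyone help? The number of columns in the dataframe is determined at runtime.

a[0][0] * a[0][1] * a[0][2]
a[0][0] * a[0][1] * a[1][2]
and so on.

Expected output after the intersection operation:

2
set()
2
set()
2
set()
2
set()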
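One way to read the expected output: pick one cell from each column (from any row), intersect the chosen lists as sets, and repeat for every such combination. The following is only a sketch under that assumption. The helper name `column_combination_intersections` is made up for illustration, and the data is the question's dataframe with the stray quote in the first cell repaired.

```python
from itertools import product

import pandas as pd

# The question's data, with the missing quote in the first cell restored.
df = pd.DataFrame([[['1', '2', '3'], ['5', '6', '2'], ['2', '5']],
                   [['2', '3'], ['2', '3'], ['1']]])


def column_combination_intersections(frame):
    """Intersect one cell per column, for every combination of rows.

    Hypothetical helper: it handles any number of columns because
    product() receives one list of cells per column.
    """
    columns = [frame[name].tolist() for name in frame.columns]
    return [set.intersection(*map(set, cells)) for cells in product(*columns)]


intersections = column_combination_intersections(df)
for common in intersections:
    print(common)
```

With two rows and three columns this produces 2 * 2 * 2 = 8 sets, alternating between `{'2'}` and `set()`, which lines up with the eight entries listed in the question.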

python pandas dataframe cartesian-product
1 answer
0 votes

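The function that generates the Cartesian product of sequences is itertools.product. Applying it to a single row gives the Cartesian product of the sequences in that row, which I think is what you want.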

import pandas as pd
from itertools import product

# The dataframe you supplied
df = pd.DataFrame([[[1, 2, 3], [5, 6, 2], [2, 5]], 
                   [[2, 3], [2, 3], [1]]])
print(df)

# wrap it in a function
def cartesian(row):
    # product returns a lazy iterator, so list() is used to materialize it
    # Unpack with * so each cell is a separate argument; product(row) alone
    # would only yield 1-tuples, each holding one whole cell
    return list(product(*row))

# Demonstration on the first row.
first_row = cartesian(df.iloc[0])
print(first_row)

# Apply the function to the dataframe.
# Be sure to supply the axis, otherwise you will get the product over columns.
result = df.apply(cartesian, axis=1)
print(result)
assert result[0] == first_row

You could also replace the df.apply call with a list comprehension over the rows, but apply is the more common choice.
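A minimal sketch of the list-comprehension variant, assuming the same two-row dataframe as in the answer. Iterating with `itertuples(index=False)` is one choice among several, and the comparison against `apply` is included only to show that both routes give the same values.

```python
from itertools import product

import pandas as pd

df = pd.DataFrame([[[1, 2, 3], [5, 6, 2], [2, 5]],
                   [[2, 3], [2, 3], [1]]])

# One Cartesian product per row, built without apply.
# itertuples(index=False) yields each row's cells as a plain tuple.
products = [list(product(*row)) for row in df.itertuples(index=False)]

# Same values as the apply-based version, as a plain list instead of a Series.
via_apply = df.apply(lambda row: list(product(*row)), axis=1)
assert products == via_apply.tolist()
```

The comprehension returns a plain Python list and drops the dataframe's index, so `apply` is the better fit when the result should stay aligned with the original rows.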
