I have the following dataframe:
0 ['1','2','3'] ['5','6','2'] ['2','5']
1 ['2','3'] ['2','3'] ['1']
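I want to find the Cartesian product of col1, col2 and col3. Can anyone help? The number of columns in the dataframe is generated at runtime.
a[0][0] * a[0][1] * a[0][2]
a[0][0] * a[0][1] * a[1][2]
and so on.
Expected output after the intersect operation: 2 set() 2 set() 2 set() 2 set()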
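The function that produces the Cartesian product of sequences is itertools.product. Applying it to a row gives the Cartesian product of the sequences in that row, which I think is what you want.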
import pandas as pd
from itertools import product
# The dataframe you supplied
df = pd.DataFrame([[[1, 2, 3], [5, 6, 2], [2, 5]],
                   [[2, 3], [2, 3], [1]]])
print(df)
# wrap it in a function
def cartesian(row):
    # We use list because product returns a lazy iterator
    # We unpack with * so that each cell is passed as a separate iterable;
    # product(row) would only yield 1-tuples of the cells themselves
    return list(product(*row))
# Demonstration on the first row.
first_row = cartesian(df.iloc[0])
print(first_row)
# Apply the function to the dataframe.
# Be sure to supply the axis, otherwise you will get the product over columns.
result = df.apply(cartesian, axis=1)
print(result)
assert result[0] == first_row
You could also wrap the whole thing in a list comprehension, but using apply is generally preferred.
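For reference, here is a rough sketch of what the list comprehension version might look like. It reuses the same sample data as above; the name result_lc is only illustrative, and iterating with itertuples is one possible choice, not the only way to walk the rows.

```python
import pandas as pd
from itertools import product

# Same sample data as in the answer above
df = pd.DataFrame([[[1, 2, 3], [5, 6, 2], [2, 5]],
                   [[2, 3], [2, 3], [1]]])

# itertuples(index=False, name=None) yields each row's cells as a plain tuple,
# so this works for any number of columns.
# result_lc is a hypothetical name chosen for this sketch.
result_lc = [list(product(*row))
             for row in df.itertuples(index=False, name=None)]

print(len(result_lc[0]))  # number of combinations for the first row
print(result_lc[1])       # combinations for the second row
```

This gives a plain Python list with one entry per row. If you need the original index back, wrapping it as pd.Series(result_lc, index=df.index) should be enough.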