Count the unique values in each DataFrame column and convert the result to a dictionary

Question

import pandas as pd

data = {'Col1': [1, 2, 2, 3, 1],
        'Col2': ['A', 'B', 'B', 'A', 'C']}
df = pd.DataFrame(data)

I want a dictionary like this:

{'Col1': {1: 2, 2: 2, 3: 1},
 'Col2': {'A': 2, 'B': 2, 'C': 1}}

without using any kind of loop, apply, or agg method.

I tried something like this:

count_matrix = df.stack().groupby(level=1).value_counts()
count_matrix = count_matrix.unstack(0)
count_matrix = count_matrix.to_dict()

But it doesn't work, because unstacking inserts NaN values to fill the gaps for values that do not occur in a given column.
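To see where the NaN values come from, the attempt can be reproduced step by step. This is only a minimal sketch using the sample frame from the question; the intermediate names stacked, counts and wide are illustrative and not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 2, 3, 1],
                   'Col2': ['A', 'B', 'B', 'A', 'C']})

# Long format: one entry per (row label, column name) pair.
stacked = df.stack()

# Counts per column; the resulting index is (column name, value).
counts = stacked.groupby(level=1).value_counts()

# Moving the column names into the columns forces both columns to share
# one row index: the union of all values seen in any column.
wide = counts.unstack(0)

# Values that never occur in a column (e.g. 'A' in Col1) have no count,
# so pandas fills those cells with NaN.
print(wide.isna().sum())

result = wide.to_dict()
```

Each inner dictionary therefore contains every value from every column, with NaN wherever the value is absent, which is why the output differs from the desired one.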

Answer 1

One possible solution:

{col: df[col].value_counts().to_dict() for col in df}

Output:

{'Col1': {1: 2, 2: 2, 3: 1}, 'Col2': {'A': 2, 'B': 2, 'C': 1}}
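For a self-contained check, the comprehension can be wrapped in a small function and run on the sample data. This is a sketch, not a definitive implementation; the helper name value_counts_by_column is hypothetical. Note that a dict comprehension is still a Python-level loop over the columns.

```python
import pandas as pd


def value_counts_by_column(df):
    """Map each column name to a {value: count} dict (hypothetical helper)."""
    # Iterating over a DataFrame yields its column labels.
    return {col: df[col].value_counts().to_dict() for col in df}


df = pd.DataFrame({'Col1': [1, 2, 2, 3, 1],
                   'Col2': ['A', 'B', 'B', 'A', 'C']})

result = value_counts_by_column(df)
print(result)
```

Because each column is counted on its own, no alignment across columns takes place and no NaN entries appear.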

Answer 2

If you want pure pandas without an explicit loop, you can use agg, value_counts, and to_dict:

df.agg(lambda x: x.value_counts().to_dict()).to_dict()

Alternatively, use map to get around the loop restriction:

dict(map(lambda x: (x[0], x[1].value_counts().to_dict()), df.items()))

Output:

{'Col1': {1: 2, 2: 2, 3: 1}, 'Col2': {'A': 2, 'B': 2, 'C': 1}}
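The two variants can be compared side by side on the sample data. This is a minimal sketch, assuming a reasonably recent pandas version in which agg with a dict-returning function yields a Series of dicts; the names via_agg, via_map and expected are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 2, 3, 1],
                   'Col2': ['A', 'B', 'B', 'A', 'C']})

# Variant 1: agg calls the lambda once per column; each call returns a dict,
# and the outer to_dict() keys those dicts by column name.
via_agg = df.agg(lambda x: x.value_counts().to_dict()).to_dict()

# Variant 2: df.items() yields (column name, Series) pairs, which map
# turns into (column name, counts dict) pairs.
via_map = dict(map(lambda item: (item[0], item[1].value_counts().to_dict()),
                   df.items()))

expected = {'Col1': {1: 2, 2: 2, 3: 1}, 'Col2': {'A': 2, 'B': 2, 'C': 1}}
print(via_agg == expected, via_map == expected)
```

Both variants still visit the columns one at a time; they only move the iteration out of an explicit for statement.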