In a DataFrame, I want to count the values in each column and use those values as the index.
I want to turn this:
Q1                  Q2                 Q3
Strongly agree      Agree              Undecided
Undecided           Agree              More or less disagree
Strongly agree      Agree              Undecided
Strongly agree      Strongly Disagree  Disagree
More or less agree  Undecided          Strongly disagree
into this:
                       Q1  Q2  Q3
Strongly agree          3   0   0
Agree                   0   3   0
More or less agree      1   0   0
Undecided               1   1   2
More or less disagree   0   0   1
Disagree                0   0   1
Strongly disagree       0   1   1
Is this possible with pandas?
If you insist on value_counts, you can call stack and groupby first, then value_counts, and finally unstack the result:
df.stack().groupby(level=[1]).value_counts().unstack(0, fill_value=0)
                       Q1  Q2  Q3
Agree                   0   3   0
Disagree                0   0   1
More or less agree      1   0   0
More or less disagree   0   0   1
Strongly Disagree       0   1   0
Strongly agree          3   0   0
Strongly disagree       0   0   1
Undecided               1   1   2
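To make the chain above easier to follow, here is a minimal sketch that rebuilds the question's data and runs each step separately. The DataFrame construction and the intermediate names (`stacked`, `per_column`, `counts`) are my own, introduced only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Q1": ["Strongly agree", "Undecided", "Strongly agree",
           "Strongly agree", "More or less agree"],
    "Q2": ["Agree", "Agree", "Agree", "Strongly Disagree", "Undecided"],
    "Q3": ["Undecided", "More or less disagree", "Undecided",
           "Disagree", "Strongly disagree"],
})

# 1. stack: one long Series indexed by (row label, column name).
stacked = df.stack()

# 2. groupby(level=1) groups by column name; value_counts then counts
#    each answer within a column, indexed by (column name, answer).
per_column = stacked.groupby(level=1).value_counts()

# 3. unstack(0) moves the column names back into the columns and fills
#    answers that never occur in a column with 0.
counts = per_column.unstack(0, fill_value=0)
print(counts)
```

Note that "Strongly Disagree" and "Strongly disagree" are counted as two different answers, because the comparison is case-sensitive.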
Another option is to use melt and pivot_table:
(df.melt()
.pivot_table(columns='variable', index='value', aggfunc='size', fill_value=0))
variable               Q1  Q2  Q3
value
Agree                   0   3   0
Disagree                0   0   1
More or less agree      1   0   0
More or less disagree   0   0   1
Strongly Disagree       0   1   0
Strongly agree          3   0   0
Strongly disagree       0   0   1
Undecided               1   1   2
A solution using crosstab:
v = df.melt()
pd.crosstab(v['value'], v['variable'])
variable               Q1  Q2  Q3
value
Agree                   0   3   0
Disagree                0   0   1
More or less agree      1   0   0
More or less disagree   0   0   1
Strongly Disagree       0   1   0
Strongly agree          3   0   0
Strongly disagree       0   0   1
Undecided               1   1   2
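It may help to see what melt produces before it is cross-tabulated. The sketch below rebuilds the question's data (the DataFrame construction is mine) and prints the long-format intermediate, whose default column names `variable` and `value` explain the labels in the output above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Q1": ["Strongly agree", "Undecided", "Strongly agree",
           "Strongly agree", "More or less agree"],
    "Q2": ["Agree", "Agree", "Agree", "Strongly Disagree", "Undecided"],
    "Q3": ["Undecided", "More or less disagree", "Undecided",
           "Disagree", "Strongly disagree"],
})

# melt with no arguments turns every cell into one row:
# 'variable' holds the original column name, 'value' holds the answer.
v = df.melt()
print(v.head())

# crosstab counts how often each (value, variable) pair occurs;
# pairs that never occur get 0 automatically.
counts = pd.crosstab(v["value"], v["variable"])
print(counts)
```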
You can apply pd.Series.value_counts to the whole DataFrame and fill the NaN values with 0:
df.apply(pd.Series.value_counts, axis=0).fillna(0)
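None of the outputs above matches the requested table exactly: the rows are sorted alphabetically, and "Strongly Disagree" is kept apart from "Strongly disagree". The sketch below is one possible way to close that gap, assuming the capitalisation difference is a typo and that the rows should follow the order shown in the question. The `scale` list and the `str.capitalize` normalisation are my assumptions, not part of the original answers. Since fillna leaves the counts as floats, the sketch also casts them back to integers.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Q1": ["Strongly agree", "Undecided", "Strongly agree",
           "Strongly agree", "More or less agree"],
    "Q2": ["Agree", "Agree", "Agree", "Strongly Disagree", "Undecided"],
    "Q3": ["Undecided", "More or less disagree", "Undecided",
           "Disagree", "Strongly disagree"],
})

# Assumption: differences in capitalisation are typos, so normalise
# every answer to "First letter upper, rest lower".
normalized = df.apply(lambda s: s.str.capitalize())

# Assumption: the desired row order is the one shown in the question.
scale = [
    "Strongly agree", "Agree", "More or less agree", "Undecided",
    "More or less disagree", "Disagree", "Strongly disagree",
]

counts = (
    normalized.apply(pd.Series.value_counts)  # per-column counts, NaN where absent
    .fillna(0)                                # missing answers count as 0
    .astype(int)                              # NaN forced floats; restore integers
    .reindex(scale)                           # order rows by the scale
)
print(counts)
```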