Applying a correlation function to multiple subsets of a DataFrame and combining the results in one frame

Question

I have a pandas DataFrame named "df" with the following columns:

    Income  Income_Quantile Score_1 Score_2 Score_3
0   100000              5     75      75    100
1   70000               4     55      77    80
2   50000               3     66      50    60
3   12000               1     22      60    30
4   35000               2     61      50    53
5   30000               2     66      35    77

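I also have a for loop that uses the "Income_Quantile" variable to select subsets of the DataFrame. The loop then drops the "Income_Quantile" column that was used to slice the main DataFrame, "df".

Here is the code: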

for level in df.Income_Quantile.unique():
    df_s = df.loc[df.Income_Quantile == level].drop(columns='Income_Quantile')

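Now I want to compute the Spearman rank correlation between the "Income" variable and the "Score_1", "Score_2" and "Score_3" variables in "df_s".

I would also like to concatenate the results into a single frame with the following structure: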

            Income Quantile  Score_1    Score_2     Score_3
correlation         ….         ….          ….          ….
p-value             ….         ….          ….          ….
t-statistic         ….         ….          ….          ….

I think the following approach, based on a previous question I asked, might help:

# "correlations" will be a helper function that calculates the Spearman rank
# correlation of each subset against the "Income" variable and outputs the
# p-value and t-statistic of the test for each variable.
result = {key: correlations(val) for key, val in df_s.items()}

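However, I currently don't know how to implement the next step.

Does anyone have any pointers on how to get from where I am to where I want to be? This happens to be my weakest area in Python, and I am stuck.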

Answer 1

Is this what you are looking for?

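import pandas as pd
from scipy.stats import spearmanr, ttest_ind

cols = ['Score_1','Score_2','Score_3']
df_result = pd.DataFrame(columns=cols)
# ttest_ind returns (statistic, p-value) of a two-sample t-test
df_result.loc['t-statistic'] = [ttest_ind(df['Income'], df[x])[0] for x in cols]
df_result.loc['p-value'] = [ttest_ind(df['Income'], df[x])[1] for x in cols]
# spearmanr returns (correlation, p-value)
df_result.loc['correlation']= [spearmanr(df['Income'], df[x])[1] for x in cols]
print(df_result)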

Output:

              Score_1   Score_2   Score_3
t-statistic  3.842307  3.842281  3.841594
p-value      0.003253  0.003253  0.003257
correlation  0.257369  0.227784  0.041563

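Here df_result['Score_1'] holds the t-statistic, p-value and Spearman result for df['Income'] and df['Score_1']. Note that the 'correlation' row is taken from spearmanr(...)[1], which is the p-value of the Spearman test; spearmanr(...)[0] gives the correlation coefficient itself. Let me know if this helps.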
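
If the statistics are needed per income quantile, as the question describes, one possible approach is to wrap the calculation in a helper and stack the per-group results with pd.concat. The following is only a sketch under some assumptions: the helper name "correlations" is borrowed from the question and its body is hypothetical, the t-statistic is derived from the Spearman coefficient as t = rho * sqrt((n - 2) / (1 - rho^2)) rather than from ttest_ind, and groups with fewer than three rows are left as NaN because the test is not defined for them. With the six-row sample every quantile has at most two rows, so the per-quantile entries all come out as NaN; the values only become meaningful on a larger data set.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Sample data copied from the question.
df = pd.DataFrame({
    "Income": [100000, 70000, 50000, 12000, 35000, 30000],
    "Income_Quantile": [5, 4, 3, 1, 2, 2],
    "Score_1": [75, 55, 66, 22, 61, 66],
    "Score_2": [75, 77, 50, 60, 50, 35],
    "Score_3": [100, 80, 60, 30, 53, 77],
})


def correlations(subset, target="Income", cols=("Score_1", "Score_2", "Score_3")):
    """Hypothetical helper: Spearman statistics of `target` against each column."""
    n = len(subset)
    out = pd.DataFrame(
        index=["correlation", "p-value", "t-statistic"],
        columns=list(cols),
        dtype=float,
    )
    for col in cols:
        if n < 3:
            # Assumption: with fewer than 3 rows the test is undefined, keep NaN.
            continue
        rho, p = spearmanr(subset[target], subset[col])
        # Student's t approximation: t = rho * sqrt((n - 2) / (1 - rho^2)).
        denom = 1 - rho ** 2
        t = rho * np.sqrt((n - 2) / denom) if denom > 0 else np.nan
        out[col] = [rho, p, t]
    return out


# One block of three rows per quantile, stacked into a single frame.
result = pd.concat(
    {level: correlations(group) for level, group in df.groupby("Income_Quantile")},
    names=["Income_Quantile", "statistic"],
)

# The same helper applied to the whole frame, without splitting by quantile.
overall = correlations(df)

print(result)
print(overall)
```

The result frame is indexed by (Income_Quantile, statistic), so result.loc[2] would return the three rows for quantile 2, and result.unstack("Income_Quantile") would give one column per score and quantile if a wide layout is preferred.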
