Seaborn correlation matrix with p-values in Python


I have a diagonal correlation matrix produced with seaborn. I would like to mask the cells whose p-value is greater than 0.05.

Here is what I have so far: https://imgur.com/ljwj0U2

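import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# `result` is the DataFrame holding the variables to correlate
sns.set(style="white")
corr = result.corr()
print(corr)

# Boolean mask that hides the upper triangle (seaborn hides cells where the mask is True)
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
f, ax = plt.subplots(figsize=(11, 9))
sns_plot = sns.heatmap(corr, mask=mask, annot=True, center=0, square=True, fmt=".1f", linewidths=.5, cmap="Greens")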

Any help is greatly appreciated. Thank you very much.

python matrix seaborn correlation analysis
1 Answer

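For completeness, here is a solution that uses scipy.stats.pearsonr (docs) to build a matrix of p-values. From that matrix you then create a boolean mask to pass to seaborn (or combine it with numpy's np.triu / np.tril to also hide the upper triangle of the correlation matrix).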

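def corr_sig(df=None):
    # Square matrix of p-values; the diagonal is never filled and stays 0
    p_matrix = np.zeros(shape=(df.shape[1], df.shape[1]))
    cols = df.columns.to_list()
    for col in cols:
        for col2 in df.drop(col, axis=1).columns:
            # pearsonr returns (correlation, p-value); keep only the p-value
            _, p = stats.pearsonr(df[col], df[col2])
            p_matrix[cols.index(col), cols.index(col2)] = p
    return p_matrix

p_values = corr_sig(df)
# seaborn hides the cells where the mask is True, so invert the significance test
mask = np.invert(np.tril(p_values < 0.05))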
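As an alternative sketch (not part of the original answer), the same p-value matrix can probably be obtained without the explicit double loop by passing a callable to pandas' DataFrame.corr. The names corr_pvalues and demo below are hypothetical and only serve the illustration. Note that pandas sets the diagonal to 1.0 for callable methods, so the diagonal is never treated as significant here, unlike in corr_sig where it stays 0.

```python
import numpy as np
import pandas as pd
from scipy import stats


def corr_pvalues(df):
    """Return a DataFrame of Pearson p-values for every pair of columns."""
    # pandas calls the function once per column pair with two 1-D arrays;
    # pearsonr returns (r, p), so index 1 picks the p-value.
    return df.corr(method=lambda x, y: stats.pearsonr(x, y)[1])


# Hypothetical demo data: b depends on a, c is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.5, size=200),
    "c": rng.normal(size=200),
})

p_demo = corr_pvalues(demo)
# Same mask logic as above: True means "hide this cell" in seaborn.
mask_demo = np.invert(np.tril(p_demo.to_numpy() < 0.05))
print(p_demo.round(4))
```

Because the result is a labelled DataFrame, it can be inspected next to df.corr() before it is turned into a mask.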


Complete walkthrough with an example

First, create some sample data (3 correlated variables, 2 uncorrelated variables, and 1 invariant variable):

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats
# Simulate 3  correlated variables
num_samples = 100
mu = np.array([5.0, 0.0, 10.0])
# The desired covariance matrix.
r = np.array([
        [  3.40, -2.75, -2.00],
        [ -2.75,  5.50,  1.50],
        [ -2.00,  1.50,  1.25]
    ])
y = np.random.multivariate_normal(mu, r, size=num_samples)
df = pd.DataFrame(y)
df.columns = ["Correlated1","Correlated2","Correlated3"]
for i in range(2):
    df.loc[:,f"Uncorrelated{i}"] = np.random.randint(-2000,2000,len(df))
# Also add an invariant (constant) variable as an edge case
df.loc[:, "Invariant"] = -99
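To know roughly what the heatmap should show for the three correlated variables, the covariance matrix used above can be converted into the correlation matrix it implies. This is only a sanity-check sketch; the names cov, std and implied_corr are introduced here for illustration.

```python
import numpy as np

# Covariance matrix copied from the sample-data code above.
cov = np.array([
    [ 3.40, -2.75, -2.00],
    [-2.75,  5.50,  1.50],
    [-2.00,  1.50,  1.25],
])

# A valid covariance matrix must not have negative eigenvalues.
eigenvalues = np.linalg.eigvalsh(cov)

# Correlation = covariance divided by the product of the standard deviations.
std = np.sqrt(np.diag(cov))
implied_corr = cov / np.outer(std, std)

print(np.round(implied_corr, 2))
```

The implied correlations are about -0.64, -0.97 and 0.57, so with 100 samples the estimates in df.corr() should land close to these values.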

To give you a feel for the correlations involved, here is my plotting function and an image of an example correlation matrix.

def plot_cor_matrix(corr, mask=None):
    f, ax = plt.subplots(figsize=(11, 9))
    sns.heatmap(corr, ax=ax,
                mask=mask,
                # cosmetics
                annot=True, vmin=-1, vmax=1, center=0,
                cmap='coolwarm', linewidths=2, linecolor='black',
                cbar_kws={'orientation': 'horizontal'})

With all correlations

# Plotting without significance filtering
corr = df.corr()
mask = np.triu(corr)
plot_cor_matrix(corr,mask)
plt.show()

Only significant correlations

Finally, plot only the correlations with a significant p-value (alpha < .05):

# Plotting with significance filter
corr = df.corr()                            # get correlation
p_values = corr_sig(df)                     # get p-Value
mask = np.invert(np.tril(p_values<0.05))    # mask - only get significant corr
plot_cor_matrix(corr,mask)
plt.show()
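To see which cells this mask hides, it may help to apply the same expression to a small hand-written p-value matrix. The values in p_toy are made up for illustration, with a zero diagonal as corr_sig would leave it.

```python
import numpy as np

# Hypothetical p-values for three variables (symmetric, zero diagonal).
p_toy = np.array([
    [0.00, 0.01, 0.20],
    [0.01, 0.00, 0.03],
    [0.20, 0.03, 0.00],
])

# True = hidden by seaborn. Keeps the significant cells of the lower
# triangle, including the diagonal because 0 < 0.05.
mask_toy = np.invert(np.tril(p_toy < 0.05))

# Variant with k=-1 that also hides the diagonal.
mask_strict = np.invert(np.tril(p_toy < 0.05, k=-1))

print(mask_toy)
print(mask_strict)
```

With mask_toy, only the diagonal and the two significant pairs below it stay visible; the pair with p = 0.20 and the whole upper triangle are hidden.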