Handling NaN values in scipy.stats.mannwhitneyu

Question (votes: 0, answers: 1)

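I have a dataset of logIC50 values (call it A) and another dataset of categorical clinical drug responses (call it B). B has two observations of interest: Sensitive and Resistant. However, B contains many empty (NaN) values. I want to use a one-sided, nonparametric Mann-Whitney U test to determine whether the estimated log(IC50) values of resistant tumors are significantly greater than those of sensitive tumors, but I don't know how to handle these NaN values.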

I tried this code:

from scipy.stats import mannwhitneyu

p_values = {}  # Dictionary to store p-values for each drug

# Assuming logIC50_df contains log IC50 values for different drugs
# Iterate through each column (drug) in logIC50_df
for drug in logIC50_df.columns:
    # Extract log IC50 values for the current drug
    resistant_log_ic50 = logIC50_df[drug][test_dr_new.iloc[:,0] == "Resistant"]
    sensitive_log_ic50 = logIC50_df[drug][test_dr_new.iloc[:,0] == "Sensitive"]
    
    # Perform Mann-Whitney U test (one-sided alternative hypothesis)
    statistic, p_value = mannwhitneyu(resistant_log_ic50, sensitive_log_ic50, alternative='greater')

    # Store p-value for the current drug
    p_values[drug] = p_value

    # Print individual p-value for each drug
    print(f"P-value for {drug}: {p_value}")

# Count the number of drugs with statistically significant discrimination
significance_level = 0.05
num_significant = sum(p < significance_level for p in p_values.values())

# Print the total number of drugs with statistically significant discrimination
print(f"Number of drugs with statistically significant discrimination: {num_significant}")

# Interpretation
if any(p < significance_level for p in p_values.values()):
    print("Reject the null hypothesis: estimated log(IC50) values are significantly higher in resistant tumors compared to sensitive tumors.")
else:
    print("Fail to reject the null hypothesis: there is no significant difference in estimated log(IC50) values between resistant and sensitive tumors.")

But every time I use it, the results are very poor. I suspect that the NaN values in dataset B (i.e. test_dr_new) are affecting my results. Please help with this.

scipy prediction scipy.stats statistical-test
1 answer
0 votes

How the test handles NaN is controlled by the nan_policy parameter.

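  • The default, nan_policy='propagate', makes the function return NaN if either sample contains a NaN.
  • nan_policy='raise' raises an error if either sample contains a NaN.
  • nan_policy='omit' acts as if the NaNs were not there at all. It behaves as though you had removed them before passing the samples.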
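The three policies above can be compared side by side. This is a minimal sketch, assuming a SciPy version recent enough that mannwhitneyu accepts nan_policy; the arrays are made-up toy values, not data from the question.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Toy data (made up for illustration): one missing value in each sample.
resistant = np.array([2.1, 3.4, np.nan, 2.9, 3.8])
sensitive = np.array([1.2, np.nan, 0.9, 1.7, 2.0])

# Default policy ('propagate'): a NaN in the input gives a NaN result.
res_propagate = mannwhitneyu(resistant, sensitive, alternative="greater")

# 'omit': NaNs are dropped from each sample before the test is run.
res_omit = mannwhitneyu(resistant, sensitive, alternative="greater",
                        nan_policy="omit")

# Removing the NaNs by hand should give the same result as 'omit'.
res_manual = mannwhitneyu(resistant[~np.isnan(resistant)],
                          sensitive[~np.isnan(sensitive)],
                          alternative="greater")

# 'raise': the call fails instead of returning a result.
try:
    mannwhitneyu(resistant, sensitive, alternative="greater",
                 nan_policy="raise")
    raised = False
except ValueError:
    raised = True

print("propagate:", res_propagate.pvalue)
print("omit:     ", res_omit.pvalue)
print("manual:   ", res_manual.pvalue)
print("raise raised ValueError:", raised)
```

If the 'propagate' line prints nan, NaNs in the log(IC50) values are the likely cause of unusable p-values, and 'omit' (or dropping them yourself) is probably what you want.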

See the documentation.
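Applied to a per-drug loop like the one in the question, the cleanup might look like the sketch below. The DataFrame contents and the labels Series are hypothetical stand-ins for logIC50_df and the first column of test_dr_new, and the sketch assumes both share the same row index. Note that a comparison such as labels == "Resistant" is already False for NaN labels, so rows with a missing response fall out of both groups on their own; what still needs handling is NaN in the log(IC50) values themselves.

```python
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu

# Hypothetical toy data standing in for logIC50_df and test_dr_new.iloc[:, 0].
logIC50_df = pd.DataFrame({
    "drug_a": [2.1, 3.4, np.nan, 2.9, 1.2, 0.9, 1.7, 2.5],
    "drug_b": [1.0, 1.1, 1.3, np.nan, 1.2, 1.4, 0.8, 0.9],
})
labels = pd.Series(["Resistant", "Resistant", "Resistant", "Resistant",
                    "Sensitive", "Sensitive", "Sensitive", np.nan])

p_values = {}
for drug in logIC50_df.columns:
    # Rows with a NaN label match neither comparison, so they are excluded here.
    # dropna() then removes missing log(IC50) values within each group.
    resistant = logIC50_df.loc[labels == "Resistant", drug].dropna()
    sensitive = logIC50_df.loc[labels == "Sensitive", drug].dropna()

    # Skip drugs where one group has no usable values left.
    if resistant.empty or sensitive.empty:
        p_values[drug] = np.nan
        continue

    result = mannwhitneyu(resistant.to_numpy(), sensitive.to_numpy(),
                          alternative="greater")
    p_values[drug] = result.pvalue

for drug, p in p_values.items():
    print(f"P-value for {drug}: {p}")
```

Dropping the NaNs explicitly is equivalent to passing nan_policy='omit', but it also lets you check how many observations remain in each group before trusting the p-value.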
