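I use statsmodels.stats.inter_rater.fleiss_kappa to compute my inter-rater reliability. It only returns the kappa value. What should I do if I also need the z-value, the p-value, and the confidence bounds?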
My advice is to reopen your statistics lecture notes and look up the formulas. This is a standard exercise I often give my students:
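import numpy as np
import pandas as pd
from statsmodels.stats.inter_rater import fleiss_kappa
from scipy.stats import norm

# Simulated ratings: 15 items, each rated by 30 raters into categories 0, 1, 2
np.random.seed(42)
data = {
    f'Item{i+1}': np.random.choice([0, 1, 2], size=30, p=[0.33, 0.33, 0.34]) for i in range(15)
}
df = pd.DataFrame(data)

# Count table expected by fleiss_kappa: one row per item, one column per category
formatted_data = {
    f"Category {cat}": [(df[item] == cat).sum() for item in df] for cat in range(3)
}
formatted_df = pd.DataFrame(formatted_data)
kappa = fleiss_kappa(formatted_df.values)

# NOTE: axis=1 sums across categories, so every entry equals 30 (ratings per item);
# per-category totals would be formatted_df.sum(axis=0).
category_totals = formatted_df.sum(axis=1)
p = np.sum((category_totals / (30 * 15))**2)
n = 15      # number of items
k = 3       # number of categories
N = n * 30  # total number of ratings

# NOTE: ad hoc variance approximation; it differs from the large-sample
# variance given by Fleiss (1971), so treat the results as rough.
variance = (1 / (N * (n - 1))) * (N * p * (1 - p) + (n * (k - 1) * (p - (1 / k)**2)))

if variance > 0:
    # z-statistic and two-sided p-value for H0: kappa = 0
    z_value = kappa / np.sqrt(variance)
    p_value = 2 * (1 - norm.cdf(np.abs(z_value)))

    # 95% normal-approximation confidence interval
    z_critical = norm.ppf(0.975)
    margin_of_error = z_critical * np.sqrt(variance)
    lower_bound = kappa - margin_of_error
    upper_bound = kappa + margin_of_error

    print("Fleiss' kappa:", kappa)
    print("Z-value:", z_value)
    print("P-value:", p_value)
    print("Confidence interval (95%):", (lower_bound, upper_bound))
else:
    print("Variance calculation error: Non-positive variance", variance)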
This gives:
Fleiss' kappa: -0.008536683290635389
Z-value: -0.1312124600755962
P-value: 0.8956072394628303
Confidence interval (95%): (-0.13605194965657783, 0.11897858307530704)
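
The variance expression in the code above does not appear to match the textbook formulas, so for comparison here is a minimal sketch that uses the null-hypothesis variance usually attributed to Fleiss, Nee and Landis (1979). The function name fleiss_kappa_inference and its return format are my own assumptions, not part of statsmodels. The confidence interval simply reuses the null-hypothesis standard error, which is only an approximation, and the function assumes every item was rated by the same number of raters and that more than one category is actually used.

```python
from statistics import NormalDist

import numpy as np


def fleiss_kappa_inference(table, alpha=0.05):
    """Fleiss' kappa with z-value, two-sided p-value and an approximate CI.

    table: (subjects x categories) count matrix, same layout as the input of
    statsmodels' fleiss_kappa. This helper is hypothetical, not a library API.
    """
    table = np.asarray(table, dtype=float)
    n_subjects = table.shape[0]
    n_raters = table[0].sum()
    if not np.allclose(table.sum(axis=1), n_raters):
        raise ValueError("every subject needs the same number of ratings")

    # Share of all ratings that fall into each category (column totals)
    p_j = table.sum(axis=0) / (n_subjects * n_raters)
    q_j = 1.0 - p_j

    # Mean observed agreement, chance agreement, and kappa
    p_bar = ((table ** 2).sum() - n_subjects * n_raters) / (
        n_subjects * n_raters * (n_raters - 1)
    )
    p_e = (p_j ** 2).sum()
    kappa = float((p_bar - p_e) / (1.0 - p_e))

    # Variance of kappa under H0: kappa = 0 (Fleiss, Nee & Landis, 1979)
    s = (p_j * q_j).sum()
    variance = (
        2.0 * (s ** 2 - (p_j * q_j * (q_j - p_j)).sum())
        / (s ** 2 * n_subjects * n_raters * (n_raters - 1))
    )
    se = float(np.sqrt(variance))

    std_normal = NormalDist()
    z = kappa / se
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))

    # Assumption: the null-hypothesis SE is reused for the interval
    margin = std_normal.inv_cdf(1.0 - alpha / 2.0) * se
    return {
        "kappa": kappa,
        "z": z,
        "p_value": p_value,
        "ci": (kappa - margin, kappa + margin),
    }
```

With the table built in the answer, it could be called as fleiss_kappa_inference(formatted_df.values). The kappa it returns should agree with statsmodels' fleiss_kappa, while the z-value, p-value and interval will generally differ from the numbers printed above because the variance is computed differently.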