Trying to rule out astrology, but something is off

Question  Votes: 0  Answers: 2

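I am trying to show that any astrological influence on a population is statistically insignificant, but without success. I am using Pearson's chi-squared test to compare two distributions of sun signs drawn from two different populations, one of astronaut pilots and one of celebrities. Something must be wrong, but I have not been able to find it; the mistake is probably on the statistical side.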

import numpy as np
import pandas as pd
import ephem
from collections import Counter, namedtuple
import matplotlib.pyplot as plt
from scipy import stats
models = pd.read_csv('models.csv', delimiter=',')
astronauts = pd.read_csv('astronauts.csv', delimiter=',')
models = models.sample(229)
astronauts = astronauts.sample(229)

sun = ephem.Sun()


def get_planet_constellation(planet, dataset):
    person_planet_constellation = []
    for person in dataset['Birth Date']:
        planet.compute(person)
        person_planet_constellation += [ephem.constellation(planet)[1]]
    return person_planet_constellation


def plot_bar_group(planet, data1, data2):
    fig, ax = plt.subplots()
    plt.bar(data1.keys(), data1.values(), alpha=0.5)
    plt.bar(data2.keys(), data2.values(), alpha=0.5)
    plt.legend(['astronauts', 'models'])
    ylabel = 'Percentages of ' + planet.name + ' in constellation'
    ax.set_ylabel(ylabel)
    title = 'Histogram of ' + planet.name + ' in constellation by group'
    ax.set_title(title)
    plt.show()


astronaut_sun_constellation = Counter(
    get_planet_constellation(sun, astronauts))
model_sun_constellation = Counter(get_planet_constellation(sun, models))


plot_bar_group(sun, astronaut_sun_constellation, model_sun_constellation)

a = list(astronaut_sun_constellation.values())
b = list(model_sun_constellation.values())
s = np.array([a, b])

stat, p, dof, expected = stats.chi2_contingency(s)
print(stat, p, dof, expected)

prob = 0.95
critical = stats.chi2.ppf(prob, dof)
if abs(stat) >= critical:
    print('Dependent (reject H0)')
else:
    print('Independent (fail to reject H0)')

# interpret p-value
alpha = 1.0 - prob
if p <= alpha:
    print('Dependent (reject H0)')
else:
    print('Independent (fail to reject H0)')

https://www.dropbox.com/s/w7rye6m5lbihjlh/astronauts.csv https://www.dropbox.com/s/xlxanr0pxqtxcvv/models.csv

statistics chi-squared
2 Answers
0
Votes

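I eventually found the bug: it was in passing the Counter values to the chi-squared function as plain lists. The counts must first be sorted by constellation so that each column of the table refers to the same constellation in both groups. Otherwise the two rows are in whatever order the constellations were first encountered, and the test sees large differences between counts that do not belong together. With that fixed, all astrological effects are insignificant at the 0.95 level (alpha = 0.05), as expected.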
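To illustrate the fix, here is a minimal sketch using only the standard library. The helper names `aligned_table` and `pearson_chi2` and the sample counts are made up for this example and are not from the original code. It builds both rows over one shared, sorted list of constellation names, then computes the Pearson statistic by hand to show how much misaligned columns can inflate it.

```python
from collections import Counter


def aligned_table(counts_a, counts_b):
    """Build a 2 x K table whose columns follow one shared, sorted key order."""
    keys = sorted(set(counts_a) | set(counts_b))
    # A Counter returns 0 for a missing key, so both rows get the same length.
    row_a = [counts_a[k] for k in keys]
    row_b = [counts_b[k] for k in keys]
    return keys, [row_a, row_b]


def pearson_chi2(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: identical distributions, different insertion order.
group_a = Counter({'Leo': 30, 'Virgo': 10, 'Libra': 20})
group_b = Counter({'Virgo': 10, 'Libra': 20, 'Leo': 30})

# Insertion order, as in the question: the columns do not match.
misaligned = [list(group_a.values()), list(group_b.values())]
keys, aligned = aligned_table(group_a, group_b)

print(pearson_chi2(misaligned))  # clearly greater than zero
print(pearson_chi2(aligned))     # 0.0, the two groups are identical
```

In the question's script, the aligned rows would take the place of `a` and `b` before the call to `stats.chi2_contingency`. Building the rows from the union of keys also covers the case where a constellation appears in only one of the two samples.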

0
Votes

As an idea, you might also look at this from the perspective of the Moon, Jupiter, Mars, and finally Venus.

Just a thought.
