I have the following DataFrame, mmLog:
Experiment Logmm
0 Spontaneous1 0.022815
1 Light1 0.007222
2 PTZ1 0.03168
3 Spontaneous1 0.015003
4 Light1 0.013402
5 PTZ1 0.021539
... ... ...
38072 SpontaneousControl147 0.013685
38073 SpontaneousControl147 0.034702
38074 SpontaneousControl147 0.008993
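I want to run a t-test for each unique group in the "Experiment" column, comparing it against the control group. I tried building a dictionary of DataFrames keyed by the unique identifier: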
df_uniq = dict()
for k, v in mmLog.groupby('Experiment'):
    df_uniq[k] = v
and then using a for loop:
from scipy.stats import ttest_ind
for key in df_uniq:
    cat1 = key
    cat2 = df[df['Experiment']=='SpontaneousControl147']
    ttest_ind(cat1['mm Log10(n+1)'], cat2['mm Log10(n+1)'])
but I get TypeError: string indices must be integers
You want to assign the dictionary's values, not its keys, to cat1:
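from scipy.stats import ttest_ind

# mmLog is the DataFrame from the question (the loop there calls it df).
# The control group is the same for every comparison, so select it once.
cat2 = mmLog[mmLog['Experiment'] == 'SpontaneousControl147']
for name, cat1 in df_uniq.items():
    # cat1 is the sub-DataFrame for this group, not the group name
    result = ttest_ind(cat1['mm Log10(n+1)'], cat2['mm Log10(n+1)'])
    print(name, result)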
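By assigning the key to cat1, you were trying to run the t-test on a string (the group name) rather than on the groupby result. Indexing that string with a column name is what raises the TypeError.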
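For reference, the whole comparison can probably be done without the intermediate dictionary. The following is only a minimal sketch on synthetic data: the helper name compare_to_control is hypothetical, the values are random stand-ins rather than your measurements, and the value column is assumed to be called "Logmm" as in the printed table (your loop uses 'mm Log10(n+1)', so adjust value_col to whatever the column is really called).

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind


def compare_to_control(frame, control, group_col="Experiment", value_col="Logmm"):
    """Run an independent two-sample t-test of every group against `control`.

    Hypothetical helper; the column names are assumptions, not taken from real data.
    """
    control_values = frame.loc[frame[group_col] == control, value_col]
    rows = []
    for name, group in frame.groupby(group_col):
        if name == control:
            continue  # do not compare the control group with itself
        stat, pvalue = ttest_ind(group[value_col], control_values)
        rows.append({group_col: name, "statistic": stat, "pvalue": pvalue})
    return pd.DataFrame(rows)


# Synthetic stand-in data (random numbers, not the real measurements).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Experiment": np.repeat(
        ["Spontaneous1", "Light1", "PTZ1", "SpontaneousControl147"], 30
    ),
    "Logmm": rng.normal(0.02, 0.01, size=120),
})

results = compare_to_control(demo, "SpontaneousControl147")
print(results)
```

Collecting the results into a DataFrame gives one row per group, which is easier to sort or filter by p-value than output printed inside the loop. With this many groups you may also want to think about a multiple-comparison correction, but that is a separate question.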