Z-score normalization in a pandas DataFrame (Python)

Question

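I am using Python 3 (Spyder) and have a table that is an object of type "pandas.core.frame.DataFrame". I want to z-score normalize the values in that table (subtract the row mean from each value and divide by the row sd), so that each row has mean = 0 and sd = 1. I tried 2 approaches.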

First approach

from scipy.stats import zscore
zetascore_table=zscore(table,axis=1)

Second approach

rows=table.index.values
columns=table.columns
import numpy as np
for i in range(len(rows)):
    for j in range(len(columns)):
         table.loc[rows[i],columns[j]]=(table.loc[rows[i],columns[j]] - np.mean(table.loc[rows[i],]))/np.std(table.loc[rows[i],])
table

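Both approaches seem to run, but when I check the mean and sd of each row, they are not 0 and 1 (as they should be) but other float values. I don't know what the problem could be.

Thanks for your help!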

Answer 1

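Sorry, after thinking about it some more I found a way to compute the z-scores myself that is easier than the for loops (subtract each row's mean and divide the result by that row's sd):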

table = table.T  # transpose so that each original row becomes a column
sd = table.std(axis=0, ddof=0)  # per-column population sd (ddof=0, as np.std uses)
mean = table.mean(axis=0)  # per-column mean
numerator = table - mean  # numerator in the formula for the z-score
z_score = numerator / sd
z_norm_table = z_score.T  # transpose back: the initial table, with all
# values z-scored by row

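I checked, and now the mean of each row is 0 or very close to 0 and the sd is 1 or very close to 1, so this works for me. Sorry, I have very little coding experience, and sometimes easy things take a lot of trial and error before I figure out how to solve them.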
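
For reference, the same row-wise z-score can probably be written without transposing, using `axis=1` for the row statistics and `axis=0` to align them back onto the rows. This is only a minimal sketch: the small `table` below is made-up example data (the original table is not shown in the question), and `ddof=0` is an assumption chosen to match the defaults of `np.std` and `scipy.stats.zscore`. Two things worth noting: the nested loop in the question overwrites cells of a row while still recomputing that row's mean and sd from the partially updated values, which by itself would give wrong results; and pandas' own `.std()` defaults to `ddof=1`, so checking with it gives values slightly different from 1 even when the normalization is right.

```python
import numpy as np
import pandas as pd

# Hypothetical example data, only for illustration.
table = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 50.0], [5.0, 7.0, 6.0, 9.0]],
    index=["r1", "r2", "r3"],
    columns=["a", "b", "c", "d"],
)

# Row statistics computed once, from the untouched original values.
row_mean = table.mean(axis=1)
row_sd = table.std(axis=1, ddof=0)  # population sd (assumption: same as np.std)

# Subtract each row's mean and divide by each row's sd; axis=0 aligns
# the per-row Series with the DataFrame's index.
z_norm_table = table.sub(row_mean, axis=0).div(row_sd, axis=0)

# Check: row means should be ~0 and row sds (with the same ddof) ~1.
print(z_norm_table.mean(axis=1))
print(z_norm_table.std(axis=1, ddof=0))
```

Because the result is a new DataFrame, the original `table` stays unchanged, which avoids the in-place update problem of the loop.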
