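I want to take all values up to and including the current one and compute their skewness, so that I end up with a Series in which each row holds the cumulative skewness up to that row. Note that the series r also contains zeros.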
import pandas as pd

def skewness_line(r):
    """
    Computes the cumulative skewness of the supplied Series.
    Returns a list with the skewness of all values up to and including each row.
    """
    acumulado = pd.Series(dtype=float)
    result = []
    contador = 0
    for i in r:
        # append the current value, then recompute over everything seen so far
        acumulado[contador] = i
        demeaned_r = acumulado - acumulado.mean()
        # use the population standard deviation, so set ddof=0
        sigma_r = acumulado.std(ddof=0)
        exp = (demeaned_r**3).mean()
        result.append(exp/sigma_r**3)
        contador += 1
    return result
Usage example:
df1['SK'] = skewness_line(df1['Col_name'])
df1.head()
You can use .expanding() + .apply():
def skew(acumulado):
    demeaned_r = acumulado - acumulado.mean()
    # use the population standard deviation, so set ddof=0
    sigma_r = acumulado.std(ddof=0)
    exp = (demeaned_r**3).mean()
    return exp / sigma_r**3

df1["SK_new"] = df1["Col_name"].expanding().apply(skew)
print(df1)
Prints:
   Col_name        SK    SK_new
0         1       NaN       NaN
1         2  0.000000  0.000000
2         3  0.000000  0.000000
3         0  0.000000  0.000000
4         1  0.271545  0.271545
5         2  0.000000  0.000000
6         3 -0.192012 -0.192012
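If the column is long, calling a Python function once per expanding window can get slow. As a rough sketch (not part of the original answer), the same population skewness can be computed from cumulative sums of x, x² and x³, because the central moments can be written in terms of the raw moments. The function name expanding_skew_vectorized is hypothetical, and the sample values are simply the Col_name column from the output above. This form can lose precision when the values are large relative to their spread, so treat it as an optional alternative rather than a drop-in replacement.

```python
import numpy as np
import pandas as pd


def expanding_skew_vectorized(s: pd.Series) -> pd.Series:
    """Cumulative population skewness (ddof=0) via cumulative raw moments.

    Hypothetical helper for illustration; assumes s has no missing values.
    """
    x = s.to_numpy(dtype=float)
    n = np.arange(1, len(x) + 1)

    # Expanding raw moments E[x], E[x^2], E[x^3]
    m1 = np.cumsum(x) / n
    m2 = np.cumsum(x**2) / n
    m3 = np.cumsum(x**3) / n

    # Central moments expressed through raw moments
    var = m2 - m1**2
    third = m3 - 3 * m1 * m2 + 2 * m1**3

    # Skewness is undefined where the variance is zero (e.g. the first row)
    with np.errstate(divide="ignore", invalid="ignore"):
        skew = np.where(var > 0, third / var**1.5, np.nan)

    return pd.Series(skew, index=s.index)


# Sample data: the Col_name values shown in the printed output
df1 = pd.DataFrame({"Col_name": [1, 2, 3, 0, 1, 2, 3]})
df1["SK_vec"] = expanding_skew_vectorized(df1["Col_name"])
print(df1)
```

On this sample the SK_vec column should agree with SK_new up to floating-point rounding, with NaN in the first row where the variance is zero.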