How can I optimize this function?


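I want to take all the values up to the current one and compute their skewness, so that I end up with a series in which each row holds the cumulative skewness of the values up to that row. Note that the series r also contains zeros.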

import pandas as pd

def skewness_line(r):
    """
    Computes the cumulative (expanding) skewness of the supplied Series.
    Returns a list holding, for each row, the skewness of all values up to that row.
    """
    acumulado = pd.Series(dtype=float)
    result = []
    contador = 0
    for i in r:
        # grow the accumulated series by one value per iteration
        acumulado[contador] = i
        demeaned_r = acumulado - acumulado.mean()
        # use the population standard deviation, so set ddof=0
        sigma_r = acumulado.std(ddof=0)
        exp = (demeaned_r**3).mean()
        result.append(exp/sigma_r**3)
        contador += 1

    return result

Usage example

df1['SK'] = skewness_line(df1['Col_name'])
df1.head()
python pandas numpy data-science
1 answer

You can use .expanding() + .apply():

def skew(acumulado):
    demeaned_r = acumulado - acumulado.mean()
    # use the population standard deviation, so set ddof=0
    sigma_r = acumulado.std(ddof=0)
    exp = (demeaned_r**3).mean()
    return exp / sigma_r**3


df1["SK_new"] = df1["Col_name"].expanding().apply(skew)
print(df1)

Prints:

   Col_name        SK    SK_new
0         1       NaN       NaN
1         2  0.000000  0.000000
2         3  0.000000  0.000000
3         0  0.000000  0.000000
4         1  0.271545  0.271545
5         2  0.000000  0.000000
6         3 -0.192012 -0.192012
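
The .expanding().apply() version still recomputes the mean and standard deviation from scratch for every row, so the total work grows roughly quadratically with the length of the series. If that is still too slow, one possible alternative is to build the same population skewness from cumulative sums of x, x**2 and x**3, which takes a single pass over the data. The following is only a sketch of that idea, not part of the original answer: the function name expanding_skew_cumsum is made up for illustration, and it assumes the input has no NaN values. Raw-moment formulas of this kind can also lose precision when the values are large relative to their spread.

```python
import numpy as np
import pandas as pd


def expanding_skew_cumsum(values):
    """Expanding population skewness computed from cumulative raw moments.

    Hypothetical helper for illustration; assumes `values` contains no NaN.
    """
    x = np.asarray(values, dtype=float)
    n = np.arange(1, x.size + 1)          # number of observations seen so far

    m1 = np.cumsum(x) / n                 # running mean
    m2_raw = np.cumsum(x**2) / n          # running mean of squares
    m3_raw = np.cumsum(x**3) / n          # running mean of cubes

    # central moments expressed through the raw moments
    var = m2_raw - m1**2
    third = m3_raw - 3 * m1 * m2_raw + 2 * m1**3

    # rows with zero variance (e.g. the first row) have undefined skewness
    with np.errstate(divide="ignore", invalid="ignore"):
        out = np.where(var > 0, third / var**1.5, np.nan)

    return pd.Series(out, index=getattr(values, "index", None))


df1 = pd.DataFrame({"Col_name": [1, 2, 3, 0, 1, 2, 3]})
df1["SK_cumsum"] = expanding_skew_cumsum(df1["Col_name"])
print(df1)
```

On the sample column above this should agree with SK and SK_new up to floating-point rounding. A smaller change that keeps the answer's approach is to pass raw=True to .apply(), which hands the function a NumPy array instead of a Series and usually reduces the per-row overhead.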