How do I calculate a rolling average for each product?


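I have the first three columns below in a pandas DataFrame. I want to calculate a 3-day moving average for each product, as shown in the fourth column.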

Data

print (df)
       Date     Product  Demand  mov Avg
0  1-Jan-19  Product-01       3      NaN
1  2-Jan-19  Product-01       4      NaN
2  3-Jan-19  Product-01       5      4.0
3  4-Jan-19  Product-01       6      5.0
4  5-Jan-19  Product-01       7      6.0
5  3-Jan-19  Product-02       2      NaN
6  4-Jan-19  Product-02       3      NaN
7  5-Jan-19  Product-02       4      3.0
8  6-Jan-19  Product-02       5      4.0
9  7-Jan-19  Product-02       8      5.7

I tried combining groupby with a rolling mean, but it doesn't seem to work.

df['mov_avg'] =df.set_index('Date').groupby('Product').rolling('Demand',window=7).mean().reset_index(drop=True)
python pandas pandas-groupby
1 Answer

First, convert the Date column to datetime:

df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')

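Change your solution to select the Demand column after the groupby and call rolling(3). Your call passes 'Demand' as the window and then passes window=7 as well, and the average you want spans 3 rows, not 7. A 3-row window equals a 3-day window here because each product has one row per consecutive day:

#sort by Product, then Date, so the row order matches the grouped result
df = df.sort_values(['Product','Date']).reset_index(drop=True)

df['mov_avg'] = (df.set_index('Date')
                   .groupby('Product')['Demand']
                   .rolling(3)
                   .mean()
                   .reset_index(drop=True))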
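
The positional assignment above only lines up when the frame's row order matches the order of the grouped result. As a minimal sketch of an alternative that sidesteps this, the block below uses groupby(...).transform, which returns a Series aligned to the original index. The sample frame is rebuilt from the question's data so the block runs on its own; the lambda and the intermediate sort are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Date': ['1-Jan-19', '2-Jan-19', '3-Jan-19', '4-Jan-19', '5-Jan-19',
             '3-Jan-19', '4-Jan-19', '5-Jan-19', '6-Jan-19', '7-Jan-19'],
    'Product': ['Product-01'] * 5 + ['Product-02'] * 5,
    'Demand': [3, 4, 5, 6, 7, 2, 3, 4, 5, 8],
})
df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')

# A row-count window needs the rows in date order within each product
df = df.sort_values(['Product', 'Date'])

# transform keeps df's index, so the result can be assigned directly
df['mov_avg'] = (df.groupby('Product')['Demand']
                   .transform(lambda s: s.rolling(3).mean()))
print(df)
```

Because the result keeps the original index, no set_index or reset_index step is needed, and the assignment stays correct even if the frame is later re-sorted.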

Another, better solution is to use DataFrame.join, which matches the averages to rows by Product and Date rather than by position:

s = df.set_index('Date').groupby('Product')['Demand'].rolling(3).mean()
df = df.join(s.rename('mov_avg'), on=['Product','Date'])

print (df)
        Date     Product  Demand  mov Avg   mov_avg
0 2019-01-01  Product-01       3      NaN       NaN
1 2019-01-02  Product-01       4      NaN       NaN
2 2019-01-03  Product-01       5      4.0  4.000000
3 2019-01-04  Product-01       6      5.0  5.000000
4 2019-01-05  Product-01       7      6.0  6.000000
5 2019-01-03  Product-02       2      NaN       NaN
6 2019-01-04  Product-02       3      NaN       NaN
7 2019-01-05  Product-02       4      3.0  3.000000
8 2019-01-06  Product-02       5      4.0  4.000000
9 2019-01-07  Product-02       8      5.7  5.666667
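
A window counted in rows spans more than three calendar days whenever a product is missing a date. The following sketch, under the assumption that such gaps should yield NaN, uses a time-based window ('3D') with min_periods=3 and joins the result back on Product and Date. The data here is hypothetical (Product-01 has no row for 3-Jan-19) and is not taken from the question.

```python
import pandas as pd

# Hypothetical data: Product-01 has no row for 3-Jan-19
df = pd.DataFrame({
    'Date': ['1-Jan-19', '2-Jan-19', '4-Jan-19', '5-Jan-19', '6-Jan-19',
             '3-Jan-19', '4-Jan-19', '5-Jan-19'],
    'Product': ['Product-01'] * 5 + ['Product-02'] * 3,
    'Demand': [3, 4, 6, 7, 8, 2, 3, 4],
})
df['Date'] = pd.to_datetime(df['Date'], format='%d-%b-%y')

# Time-based windows need dates in increasing order within each product
df = df.sort_values(['Product', 'Date'])

# '3D' covers the current day and the two calendar days before it;
# min_periods=3 leaves NaN unless all three days have a row
s = (df.set_index('Date')
       .groupby('Product')['Demand']
       .rolling('3D', min_periods=3)
       .mean())

# s is indexed by (Product, Date), so join matches on those keys
df = df.join(s.rename('mov_avg'), on=['Product', 'Date'])
print(df)
```

With this data, Product-01 only gets a value on 6-Jan-19, the first date whose three-day window is fully populated, while a row-count window of 3 would have averaged across the gap.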