I have the following data:
Date, Symbol, Close
2021-01-01, AAPL, 10
2021-01-02, AAPL, 12
2021-01-03, AAPL, 3
2021-01-01, MSFT, 12
2021-01-02, MSFT, 20
2021-01-03, MSFT, 12
2021-01-01, MSFT, 1
2021-01-02, MSFT, 2
2021-01-03, MSFT, 3
I tried the following, but I get NaN:
df['MSFT_MAX_CLOSE'] = df.loc[df['Symbol']=='MSFT'].groupby(['Date'])['Close'].transform(max)
Expected output:
Date, Symbol, Close, MSFT_MAX_CLOSE
2021-01-01, AAPL, 10, 12
2021-01-02, AAPL, 12, 20
2021-01-03, AAPL, 3, 12
2021-01-01, MSFT, 12, 12
2021-01-02, MSFT, 20, 20
2021-01-03, MSFT, 12, 12
2021-01-01, MSFT, 1, 12
2021-01-02, MSFT, 2, 20
2021-01-03, MSFT, 3, 12
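Any ideas?
transform won't work here: its result is indexed only by the MSFT rows, so the remaining rows get NaN when it is assigned back to df. You can instead use groupby.max first to aggregate the value per Date, then use map to look that value up for every row: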
msft_max = (df.loc[df['Symbol'].eq('MSFT')]
              .groupby('Date')['Close'].max()
           )
df['MSFT_MAX_CLOSE'] = df['Date'].map(msft_max)
Output:
Date Symbol Close MSFT_MAX_CLOSE
0 2021-01-01 AAPL 10 12
1 2021-01-02 AAPL 12 20
2 2021-01-03 AAPL 3 12
3 2021-01-01 MSFT 12 12
4 2021-01-02 MSFT 20 20
5 2021-01-03 MSFT 12 12
6 2021-01-01 MSFT 1 12
7 2021-01-02 MSFT 2 20
8 2021-01-03 MSFT 3 12
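
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the Date column holds plain strings, as in the sample data; the names attempt, aligned and alt are only illustrative and do not come from the question. It first shows why the filtered transform leaves gaps, then applies the groupby.max + map approach, and finally a possible one-step variant using Series.where (a different technique from the one above, which returns floats because the non-MSFT rows are masked to NaN before grouping).

```python
import pandas as pd

# Rebuild the sample data from the question (Date kept as plain strings).
df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03"] * 3,
    "Symbol": ["AAPL"] * 3 + ["MSFT"] * 6,
    "Close": [10, 12, 3, 12, 20, 12, 1, 2, 3],
})

# The approach from the question: the result only carries the index labels
# of the MSFT rows (3..8), so it cannot cover the AAPL rows.
attempt = (df.loc[df["Symbol"] == "MSFT"]
             .groupby("Date")["Close"].transform("max"))

# Column assignment aligns on the index; reindexing makes that visible.
# Rows 0..2 (AAPL) have no matching label and become NaN.
aligned = attempt.reindex(df.index)

# Approach from the answer: one max per Date, then look it up for every row.
msft_max = df.loc[df["Symbol"].eq("MSFT")].groupby("Date")["Close"].max()
df["MSFT_MAX_CLOSE"] = df["Date"].map(msft_max)

# Possible variant: mask non-MSFT closes to NaN, then transform over all rows.
# max skips NaN, and the result is aligned with the full index (float dtype).
alt = (df["Close"].where(df["Symbol"].eq("MSFT"))
         .groupby(df["Date"]).transform("max"))

print(df)
```

The map version keeps the integer dtype as long as every Date in df also appears among the MSFT rows; a date with no MSFT row would map to NaN.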