How can I efficiently calculate portfolio returns with Pandas?


I have stock trading data for different individuals and want to calculate each individual's daily returns.

Individual Stock    Date        Trade
1          AAPL     2022-01-01  +1
1          AAPL     2022-01-02  +2
2          GOOG     2022-01-01  +10
2          GOOG     2022-01-02  -5

This is combined with stock price data:

Stock Date         Price
AAPL  2022-01-01   102
AAPL  2022-01-02   98
AAPL  2022-01-03   96
GOOG  2022-01-01   31
GOOG  2022-01-02   35
GOOG  2022-01-03   40

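Currently, I loop over each date, calculate the PnL on yesterday's positions, and update the current positions with that day's trades (the DataFrame below is for 2022-01-02):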

Individual Stock Pos_EOD  Pos_Today Price_EOD   Price_Today PnL
1          AAPL  1        3         102         98          -4
2          GOOG  10       5         31          35          40
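
To make the calculation behind the PnL column explicit, here it is written as a formula. The symbols are my own shorthand and not part of the data: q is the position held by individual i in stock s at the end of date t, and P is the price.

```latex
\mathrm{PnL}_{i,t} = \sum_{s} q_{i,s,t-1}\,\bigl(P_{s,t} - P_{s,t-1}\bigr),
\qquad
q_{i,s,t} = \sum_{\tau \le t} \mathrm{Trade}_{i,s,\tau}
```

For 2022-01-02 this gives 1 × (98 − 102) = −4 for individual 1 and 10 × (35 − 31) = 40 for individual 2, matching the rows above.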

After computing the DataFrame above for each date, the final combined DataFrame is:

Individual Date        PnL
1          2022-01-02  -4
1          2022-01-03  -6
2          2022-01-02  40
2          2022-01-03  25
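
As a rough sketch of how the per-date results could be stacked into that long format: the dictionary below is filled in by hand with the PnL values from the table, keyed by the second and third price dates in the sample data, and `combined` is just an illustrative name.

```python
import pandas as pd

# Hand-built stand-in for the per-date results: one Series per date,
# indexed by Individual, holding that day's PnL.
daily_pnl_dict = {
    "2022-01-02": pd.Series({1: -4, 2: 40}, name="PnL").rename_axis("Individual"),
    "2022-01-03": pd.Series({1: -6, 2: 25}, name="PnL").rename_axis("Individual"),
}

# The dict keys become the outer index level ("Date"); reset_index then
# turns both index levels into ordinary columns.
combined = (
    pd.concat(daily_pnl_dict, names=["Date", "Individual"])
    .reset_index()
    .sort_values(["Individual", "Date"], ignore_index=True)
    [["Individual", "Date", "PnL"]]
)

print(combined)
```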

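However, with a large number of individuals (100,000+) and stocks (50,000+), performance is quite slow. Intuitively, I feel the whole operation could be vectorized, but I'm not sure how to go about it. Any suggestions would be very welcome. See the code below to reproduce.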

import pandas as pd

trades = pd.DataFrame( data = [ [1, "AAPL", "2022-01-01", 1],
                               [1, "AAPL", "2022-01-02", 2],
                               [2, "GOOG", "2022-01-01", 10],
                               [2, "GOOG", "2022-01-02", -5]],
                               columns = ["Individual", "Stock", "Date", "Trade"])

prices = pd.DataFrame( data = [ ["AAPL", "2022-01-01", 102],
                               ["AAPL", "2022-01-02", 98],
                               ["AAPL", "2022-01-03", 96],
                               ["GOOG", "2022-01-01", 31],
                               ["GOOG", "2022-01-02", 35],
                               ["GOOG", "2022-01-03", 40]],
                               columns = ["Stock", "Date", "Price"])
                  
dates = pd.unique(prices["Date"]) #define all days with prices

start_pos = trades[trades["Date"] == dates[0]] #initial positions
start_pos = pd.merge(start_pos, prices[prices["Date"]==dates[0]].drop(columns = "Date")) #merge with prices on the first day
start_pos = start_pos.rename(columns = {"Price": "Price_EOD", "Trade": "Pos_EOD"}) #rename columns to allow for merge again later
  
daily_pnl_dict = {}

for i in dates[1:]:

    start_pos = pd.merge(start_pos, trades[trades["Date"]==i].drop(columns = "Date"), on = ["Individual", "Stock"], how = "outer") #merge with new trades
    
    start_pos = pd.merge(start_pos, prices[prices["Date"]==i].drop(columns = "Date"), on = "Stock", how = "left") #merge with new prices

    start_pos["PnL"] = (start_pos["Price"] - start_pos["Price_EOD"]) * start_pos["Pos_EOD"] #calculate PnL as position x price change
    
    daily_pnl = start_pos.groupby("Individual")["PnL"].sum() #calculate pnl for each individual (column name must match "PnL" created above)

    daily_pnl_dict[i] = daily_pnl #save daily pnl

    start_pos["Price_EOD"] = start_pos["Price"] #prepare for new loop, set today's price as yesterday's price
        
    start_pos["Pos_EOD"] = start_pos["Pos_EOD"].fillna(0) + start_pos["Trade"].fillna(0) #update position with today's trade; no trade or no prior position counts as 0

    start_pos = start_pos.drop(columns = ["Price", "Trade"]) #make place for new trades and prices in next iteration
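
For reference, one way the loop might be vectorized is sketched below. It is untested at the scale described and is not a confirmed solution. It assumes that dates sort correctly as strings (ISO format), that every traded stock has a price on every date, and that positions start at zero. The names `price_wide`, `trade_wide`, `pos_prev`, and `diff_aligned` are illustrative.

```python
import pandas as pd

trades = pd.DataFrame(
    [[1, "AAPL", "2022-01-01", 1],
     [1, "AAPL", "2022-01-02", 2],
     [2, "GOOG", "2022-01-01", 10],
     [2, "GOOG", "2022-01-02", -5]],
    columns=["Individual", "Stock", "Date", "Trade"],
)
prices = pd.DataFrame(
    [["AAPL", "2022-01-01", 102],
     ["AAPL", "2022-01-02", 98],
     ["AAPL", "2022-01-03", 96],
     ["GOOG", "2022-01-01", 31],
     ["GOOG", "2022-01-02", 35],
     ["GOOG", "2022-01-03", 40]],
    columns=["Stock", "Date", "Price"],
)

# Dates x stocks price table, and the price change from the previous date.
price_wide = prices.pivot(index="Date", columns="Stock", values="Price").sort_index()
price_diff = price_wide.diff()

# Dates x (Individual, Stock) net trades, with 0 on dates without a trade.
# Only pairs that actually traded get a column.
trade_wide = trades.pivot_table(
    index="Date", columns=["Individual", "Stock"],
    values="Trade", aggfunc="sum", fill_value=0,
).reindex(price_wide.index, fill_value=0)

# Position held at the end of the previous date: running sum of trades, lagged by one row.
pos_prev = trade_wide.cumsum().shift(1)

# Repeat each stock's price change once per (Individual, Stock) column.
diff_aligned = price_diff.reindex(columns=pos_prev.columns.get_level_values("Stock"))
diff_aligned.columns = pos_prev.columns

# PnL per pair, summed over stocks for each individual; the first date has no prior position.
pnl = (pos_prev * diff_aligned).T.groupby(level="Individual").sum().T.iloc[1:]

print(pnl)
```

The result has one row per date and one column per individual. The wide frames hold one column per (Individual, Stock) pair that ever traded, not one per possible pair, but with very many pairs and dates it may still be necessary to process individuals in chunks.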