我有2个DataFrames - 1个包含股票代码和最大最小价格范围以及其他列。
另一个DataFrame有日期作为指数,并按股票代码分组,有各种指标,如开盘,收盘,高低等。现在,我想从这个DataFrame中统计一个给定股票的收盘价高于最低价的天数。
我被卡在这里:现在我想找到例如AMZN有多少天的交易价格低于期间的最高价格。
我想根据第一个数据帧的值,从第二个数据帧中统计天数,统计收盘价小于MaxMin期价的天数。
我已经添加了代码来重现DataFrames。
import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta
import yfinance as yf
start=datetime.datetime.today()-relativedelta(years=2)
end=datetime.datetime.today()
us_stock_list='FB AMZN BABA'
data_metric = yf.download(us_stock_list, start=start, end=end,group_by='column',auto_adjust=True)
data_ticker= yf.download(us_stock_list, start=start, end=end,group_by='ticker',auto_adjust=True)
stock_list=[stock for stock in data_ticker.stack()]
# max_price
max_values=pd.DataFrame(data_ticker.max().unstack()['High'])
# min_price
min_values=pd.DataFrame(data_ticker.min().unstack()['Low'])
# latest_price
latest_day=pd.DataFrame(data_ticker.tail(1).unstack())
latest_day=latest_day.unstack().unstack().unstack().reset_index()
# latest_day=latest_day.unstack().reset_index()
latest_day=latest_day.drop(columns=['level_0','Date'])
latest_day.set_index('level_3',inplace=True)
latest_day.rename(columns={0:'Values'},inplace=True)
latest_day=latest_day.groupby(by=['level_3','level_2']).max().unstack()
latest_day.columns=[ '_'.join(x) for x in latest_day.columns ]
latest_day=latest_day.join(max_values,how='inner')
latest_day=latest_day.join(min_values,how='inner')
latest_day.rename(columns={'High':'Period_High_Max','Low':'Period_Low_Min'},inplace=True)
close_price_data=pd.DataFrame(data_metric['Close'].unstack().reset_index())
close_price_data= close_price_data.rename(columns={'level_0':'Stock',0:'Close_price'})
close_price_data.set_index('Stock',inplace=True)
使用这个来重现。
{"Values_Close":{"AMZN":2286.0400390625,"BABA":194.4799957275,"FB":202.2700042725},"Values_High":{"AMZN":2362.4399414062,"BABA":197.3800048828,"FB":207.2799987793},"Values_Low":{"AMZN":2258.1899414062,"BABA":192.8600006104,"FB":199.0500030518},"Values_Open":{"AMZN":2336.8000488281,"BABA":195.75,"FB":201.6000061035},"Values_Volume":{"AMZN":9754900.0,"BABA":22268800.0,"FB":30399600.0},"Period_High_Max":{"AMZN":2475.0,"BABA":231.1399993896,"FB":224.1999969482},"Period_Low_Min":{"AMZN":1307.0,"BABA":129.7700042725,"FB":123.0199966431},"%_Position":{"AMZN":0.8382192115,"BABA":0.6383544892,"FB":0.7832576338}}
{"Stock":{
"0":"AMZN",
"1":"AMZN",
"2":"AMZN",
"3":"AMZN",
"4":"AMZN",
"5":"AMZN",
"6":"AMZN",
"7":"AMZN",
"8":"AMZN",
"9":"AMZN",
"10":"AMZN",
"11":"AMZN",
"12":"AMZN",
"13":"AMZN",
"14":"AMZN",
"15":"AMZN",
"16":"AMZN",
"17":"AMZN",
"18":"AMZN",
"19":"AMZN"},
"Date":{
"0":1525305600000,
"1":1525392000000,
"2":1525651200000,
"3":1525737600000,
"4":1525824000000,
"5":1525910400000,
"6":1525996800000,
"7":1526256000000,
"8":1526342400000,
"9":1526428800000,
"10":1526515200000,
"11":1526601600000,
"12":1526860800000,
"13":1526947200000,
"14":1527033600000,
"15":1527120000000,
"16":1527206400000,
"17":1527552000000,
"18":1527638400000,
"19":1527724800000 },
"Close_price":{
"0":1572.0799560547,
"1":1580.9499511719,
"2":1600.1400146484,
"3":1592.3900146484,
"4":1608.0,
"5":1609.0799560547,
"6":1602.9100341797,
"7":1601.5400390625,
"8":1576.1199951172,
"9":1587.2800292969,
"10":1581.7600097656,
"11":1574.3699951172,
"12":1585.4599609375,
"13":1581.4000244141,
"14":1601.8599853516,
"15":1603.0699462891,
"16":1610.1500244141,
"17":1612.8699951172,
"18":1624.8900146484,
"19":1629.6199951172}}
做一个 merge
两个数据帧之间。groupby
公司 level=0
)和 apply
一个自定义函数。
df_merge = close_price_data.merge(
latest_day[['Period_High_Max', 'Period_Low_Min']],
left_index=True,
right_index=True)
def fun(df):
d = {}
d['days_above_min'] = (df.Close_price > df.Period_Low_Min).sum()
d['days_below_max'] = (df.Close_price < df.Period_High_Max).sum()
return pd.Series(d)
df_merge.groupby(level=0).apply(fun)
Period_Low_Min
和 Period_High_Max
分别是最小值和最大值,所以所有的收盘价都会在这个范围内,如果这不是你想达到的目的,请告诉我。