Ignoring non-trading days (holidays / removing gaps) in minute/hourly data in a Plotly candlestick chart

Question  Votes: 0  Answers: 2

This answer says to use xaxis=dict(type = "category"), but I don't know where to put that parameter (I'm coming from matplotlib, and this is only for a candlestick chart).

Following some links, I found that running the code below on Day data easily removes the gaps:


import pandas as pd
import plotly.graph_objects as go

# stocks: DataFrame of OHLC bars; freq: bar interval in minutes (both defined elsewhere)
dt_all = pd.date_range(start=stocks.iloc[0, 0], end=stocks.iloc[-1, 0], freq=f'{freq}min')
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in pd.to_datetime(stocks.DATE)]
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]

range_selector = dict(buttons=[
    dict(count=5, label='5Min', step='minute', stepmode='backward'),
    dict(count=15, label='15Min', step='minute', stepmode='backward'),
    dict(count=75, label='75M', step='minute', stepmode='backward'),
    dict(count=1, label='1D', step='day', stepmode='backward'),
    dict(step='all')])

candle = go.Figure(data=[go.Candlestick(opacity=0.9, x=stocks['Date'], name='X',
                                        open=stocks['Open'],
                                        high=stocks['High'],
                                        low=stocks['Low'],
                                        close=stocks['Close'])])

candle.update_xaxes(
    title_text='Date',
    rangeslider_visible=True,
    rangebreaks=[dict(values=dt_breaks)],
    rangeselector=range_selector)  # the axis property is named rangeselector

But I have 5-minute data:

    DATE    OPEN    HIGH    LOW CLOSE   52W H   52W L   SYMBOL
374 2022-01-14 15:25:00+05:30   720.25  722.35  720.25  721.55  NaN NaN BHARTIARTL
373 2022-01-14 15:20:00+05:30   720.30  720.45  719.45  720.25  NaN NaN BHARTIARTL
372 2022-01-14 15:15:00+05:30   720.75  720.90  720.15  720.30  NaN NaN BHARTIARTL
371 2022-01-14 15:10:00+05:30   720.35  720.90  720.20  720.70  NaN NaN BHARTIARTL
370 2022-01-14 15:05:00+05:30   720.70  720.90  720.05  720.20  NaN NaN BHARTIARTL
... ... ... ... ... ... ... ... ...
4   2022-01-10 09:35:00+05:30   706.05  707.15  705.65  706.55  NaN NaN BHARTIARTL
3   2022-01-10 09:30:00+05:30   705.90  706.40  705.05  706.05  NaN NaN BHARTIARTL
2   2022-01-10 09:25:00+05:30   707.10  707.95  705.60  705.60  NaN NaN BHARTIARTL
1   2022-01-10 09:20:00+05:30   709.00  709.40  706.15  707.10  NaN NaN BHARTIARTL
0   2022-01-10 09:15:00+05:30   705.40  709.00  705.40  708.55  NaN NaN BHARTIARTL

Using the code above gives me this result:

What can be done in this case?

python plotly data-visualization plotly-python
2 Answers
4 votes

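There are other answers under "Plotly: How to remove empty dates from x axis" that are better suited to your use case. With 5-minute intervals it can be a bit tricky. Pay attention to how your timestamps are formatted, and follow these steps carefully:

  • Build the full timeline of intervals from the first observation to the last
  • Find which of your observations occur in that full timeline
  • Isolate the remaining timestamps and pass them to the rangebreaks attribute of the x-axis
  • Set the dvalue attribute of rangebreaks to match your time interval in milliseconds:
    fig.update_xaxes(rangebreaks=[dict(dvalue = 5*60*1000, values=dt_breaks)])

Essential code elements: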

# grab the first and last observations from df['Date'] and build a continuous date range from them
dt_all = pd.date_range(start=df['Date'].iloc[0], end=df['Date'].iloc[-1], freq='5min')

# format the observed timestamps the same way as the continuous range
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in df['Date']]

# isolate missing timestamps
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]

# adjust xaxis for rangebreaks
fig.update_xaxes(rangebreaks=[dict(dvalue = 5*60*1000, values=dt_breaks)])
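The same "full range minus observed" step can also be written as an index set difference, which avoids the repeated list lookups of the comprehension on long ranges. This is a sketch, not part of the original answer: `missing_timestamps` is a hypothetical helper name, and the three timestamps are made-up sample values in the format shown in the question.

```python
import pandas as pd

def missing_timestamps(timestamps, freq="5min"):
    """Return the points of a regular `freq` grid that are absent from `timestamps`.

    Hypothetical helper for illustration; `timestamps` may be a Series,
    list, or DatetimeIndex of datetimes.
    """
    observed = pd.DatetimeIndex(pd.to_datetime(timestamps))
    # Continuous grid from the first to the last observation.
    full = pd.date_range(start=observed.min(), end=observed.max(), freq=freq)
    # Grid points with no matching observation.
    return full.difference(observed)

# Made-up 5-minute timestamps with two bars missing (09:25 and 09:30).
observed = pd.to_datetime([
    "2022-01-10 09:15:00+05:30",
    "2022-01-10 09:20:00+05:30",
    "2022-01-10 09:35:00+05:30",
])

# Format as strings, as in the answer, before passing them to rangebreaks.
dt_breaks = missing_timestamps(observed).strftime("%Y-%m-%d %H:%M:%S").tolist()
print(dt_breaks)
```

The resulting list can be used as the `values` entry in `fig.update_xaxes(rangebreaks=[dict(dvalue=5*60*1000, values=dt_breaks)])`.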

Figure 1: Missing timestamps shown

Figure 2: Missing timestamps not shown

Complete code:

import plotly.graph_objects as go
from plotly.subplots import make_subplots
import pandas as pd
import numpy as np

# sample data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv').tail(90)
df = df[df.columns[:6]]
df['Date'] = pd.date_range("2018-01-01", periods=len(df), freq="5min")
df.columns = ['Date', 'Open', 'High', 'Low', 'Close', 'Volume']
df = df.tail(10)

# remove some data
np.random.seed(0)
remove_n = 4
drop_indices = np.random.choice(df.index, remove_n, replace=False)
df = df.drop(drop_indices)

# plotly candlestick figure
fig = go.Figure(data=[go.Candlestick(
    x=df['Date'],
    open=df['Open'], high=df['High'],
    low=df['Low'], close=df['Close'],
)])

# grab the first and last observations from df['Date'] and build a continuous date range from them
dt_all = pd.date_range(start=df['Date'].iloc[0], end=df['Date'].iloc[-1], freq='5min')

# format the observed timestamps the same way as the continuous range
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in df['Date']]

# isolate missing timestamps
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]
dt_breaks = pd.to_datetime(dt_breaks)

# Figure 1: gaps for the missing timestamps are still shown
fig.show()

# Figure 2: gaps removed via rangebreaks
fig.update_xaxes(rangebreaks=[dict(dvalue = 5*60*1000, values=dt_breaks)])
print(fig.layout.xaxis.rangebreaks)
fig.show()

0 votes
df1 = df.resample('h').agg({
    'close': 'last',
    'high': 'max',
    'open': 'first',
    'low': 'min',
    'volume': 'sum'
}).reset_index()
filtered_gap_time = df1[df1.isna().any(axis=1)]
df1 = df1.dropna()
dt_breaks = [d.strftime("%Y-%m-%d %H:%M:%S") for d in filtered_gap_time['datetime']]

# adjust xaxis for rangebreaks
fig.update_xaxes(rangebreaks=[dict(bounds=["thu", "fri"]), dict(dvalue=60*60*1000, values=dt_breaks)])
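  • Note: this code uses a 60-minute (1-hour) time frame, hence dvalue=60*60*1000. You can change it to any time frame, as long as the resample rule and dvalue stay in step.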
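Since `dvalue` is given in milliseconds, it can be derived from the bar interval instead of being typed by hand. The helper below is a small sketch; `dvalue_ms` is a hypothetical name and not part of Plotly or pandas.

```python
import pandas as pd

def dvalue_ms(interval):
    """Convert an interval string such as '5min' or '1h' to whole milliseconds.

    Hypothetical helper for illustration; it relies on pd.Timedelta parsing.
    """
    return int(pd.Timedelta(interval) / pd.Timedelta(milliseconds=1))

five_min = dvalue_ms("5min")  # same value as 5*60*1000
one_hour = dvalue_ms("1h")    # same value as 60*60*1000
print(five_min, one_hour)
```

The result can be passed directly, for example `dict(dvalue=dvalue_ms("1h"), values=dt_breaks)`.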