Plotting the distribution of timed events on a 24-hour timeline, with darker colours for shorter inter-event times


I have an activity dataset with two columns:

 $ respondent_id : chr [1:20836241] "1086624" "1086624" "1086624" "1086624" ...
 $ fulldate: POSIXct[1:20836241], format: "2023-05-25 05:45:40" "2023-05-22 19:42:44" ...

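How can I compute the distribution of activity over a 24-hour timeline, as in the figure below?

activity distribution examples (https://i.stack.imgur.com/igEGi.png) (Reference: page 2 of Karsai, M., Jo, H.-H., & Kaski, K. (2018). Bursty Human Dynamics. Cham: Springer International Publishing)

I can use either R or Python to produce the plot.

I tried this in Python with sample data, but it does not really solve the problem: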


import matplotlib.pyplot as plt
from matplotlib.dates import DayLocator, HourLocator, date2num, num2date
import datetime  # Import the datetime module

# Sample data (replace with your actual call log data)
call_times = [
    "2023-11-19 08:00:00",
    "2023-11-19 08:10:00",
    "2023-11-19 08:30:00",
    "2023-11-19 09:00:00",
    "2023-11-20 10:00:00",
    "2023-11-20 11:00:00",
]

# Convert call times into date objects
dates = [datetime.datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in call_times]

# Calculate the time difference between consecutive calls
time_deltas = [abs(dates[i] - dates[i-1]).total_seconds() for i in range(1, len(dates))]

# Assign darkness values based on time difference (heuristic)
darkness = [min(td / 3600, 1) for td in time_deltas]  # Normalize to 0-1

# Plot the data with darkness representing call frequency
plt.figure(figsize=(10, 6))
days = date2num(dates)
plt.plot(days, darkness, marker='o', linestyle='-')

# Format the x-axis for day and hour labels
plt.gca().xaxis.set_major_locator(DayLocator())
plt.gca().xaxis.set_major_formatter(DateFormatter("%d"))
plt.gca().xaxis.set_minor_locator(HourLocator(span=24))
plt.gca().xaxis.set_minor_formatter(DateFormatter("%H"))

# Set labels and title
plt.xlabel("Date & Time")
plt.ylabel("Call Frequency (Darker = More Frequent)")
plt.title("Outgoing Mobile Call Sequence")

# Rotate x-axis labels for readability
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()

It fails with the following error:

Traceback (most recent call last):
  File "/home/doreena/venvs/dd/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-107fc1c9c959>", line 27, in <module>
    plt.plot(days, darkness, marker='o', linestyle='-')
  File "/home/doreena/venvs/dd/lib/python3.10/site-packages/matplotlib/pyplot.py", line 3590, in plot
    return gca().plot(
  File "/home/doreena/venvs/dd/lib/python3.10/site-packages/matplotlib/axes/_axes.py", line 1724, in plot
    lines = [*self._get_lines(self, *args, data=data, **kwargs)]
  File "/home/doreena/venvs/dd/lib/python3.10/site-packages/matplotlib/axes/_base.py", line 303, in __call__
    yield from self._plot_args(
  File "/home/doreena/venvs/dd/lib/python3.10/site-packages/matplotlib/axes/_base.py", line 499, in _plot_args
    raise ValueError(f"x and y must have same first dimension, but "
ValueError: x and y must have same first dimension, but have shapes (6,) and (5,)

Tags: visualization, timeserieschart

1 Answer

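You need to fill the space between consecutive events with a colour that encodes how far apart the two events are. An example follows.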
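
Before the full example, a note on the ValueError in the question: six timestamps produce only five gaps, so the gaps cannot be plotted against the timestamps one-to-one. Treating each gap as an interval with a start and an end event avoids the mismatch. A minimal sketch, using a made-up three-event sample (the names `events` and `intervals` are only illustrative):

```python
from datetime import datetime

# Hypothetical three-event sample, just to show the shapes involved
events = [
    datetime(2023, 11, 19, 8, 0),
    datetime(2023, 11, 19, 8, 10),
    datetime(2023, 11, 19, 8, 30),
]

# Each gap belongs to the interval between an event and the one after it,
# so n events always give n - 1 (start, end, gap_in_seconds) tuples
intervals = [
    (start, end, (end - start).total_seconds())
    for start, end in zip(events[:-1], events[1:])
]

print(len(events), len(intervals))
```

The full example uses the same pairing: `zip(dates[:-1], dates[1:], second_deltas)`.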

Test data:

call_times = [
    "2023-11-19 08:00:00",
    "2023-11-19 08:10:00",
    "2023-11-19 08:30:00",
    "2023-11-19 09:00:00",
    "2023-11-19 09:20:00",
    "2023-11-19 09:50:00",
    "2023-11-19 11:50:00",
    "2023-11-19 14:50:00",
    "2023-11-19 19:50:00",
    "2023-11-19 22:50:00",
    "2023-11-20 00:50:00",
    "2023-11-20 02:20:00",
    "2023-11-20 05:50:00",
    "2023-11-20 06:30:00",
    "2023-11-20 08:00:00",
    "2023-11-20 10:00:00",
    "2023-11-20 10:50:00",
    "2023-11-20 11:00:00",
    "2023-11-20 12:00:00",
    "2023-11-20 13:05:00",
    "2023-11-20 14:10:00",
    "2023-11-20 16:00:00",
    "2023-11-20 19:20:00",
    "2023-11-20 22:30:00",
    "2023-11-20 23:00:00",
]

Code:

import matplotlib.pyplot as plt
from matplotlib import dates as mdates
from datetime import datetime

# Parse the strings into datetime objects and sort them, because the gaps
# below are only meaningful between chronologically consecutive events
dates = sorted(datetime.strptime(call_time, '%Y-%m-%d %H:%M:%S') for call_time in call_times)

# Gaps between consecutive events, converted to seconds
deltas = [dates[i + 1] - dates[i] for i in range(len(dates) - 1)]
second_deltas = [delta.total_seconds() for delta in deltas]

#
# Plot
#
from matplotlib import colors as mcolors

# Normalize() maps the smallest and largest gap to 0 and 1.
# 'Reds_r' is the reversed 'Reds' map, so short gaps come out dark.
norm = mcolors.Normalize(vmin=min(second_deltas), vmax=max(second_deltas))
cmap = plt.get_cmap('Reds_r')

#Create plot
f, ax = plt.subplots(figsize=(10, 1))

# Fill the span between each pair of consecutive events, coloured by its gap
for current_date, next_date, delta in zip(dates[:-1], dates[1:], second_deltas):
    ax.fill_between(x=[current_date, next_date], y1=0, y2=1, color=cmap(norm(delta)))

# Optional: mark the event times themselves
for date in dates:
    ax.axvline(x=date, linewidth=2.5, ymax=0.1, color='tab:green')

#Optional: colour bar
from matplotlib.cm import ScalarMappable
mappable = ScalarMappable(norm=norm, cmap=cmap.name)

ax_pos = ax.get_position()
cax = f.add_axes([
    ax_pos.x0 + ax_pos.width, ax_pos.y0, ax_pos.width/30, ax_pos.height
])
f.colorbar(mappable=mappable, cax=cax, label='delta/s')

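# Formatting: place a tick every 3 hours and label it concisely.
# The locator has to be set on the axis as well; passing it only to the
# formatter does not change where the ticks are placed.
ax.set_ylim(0, 1)
locator = mdates.HourLocator(interval=3)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))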
ax.tick_params(axis='x', rotation=60, bottom=False)
ax.yaxis.set_ticks([])
ax.spines[:].set_visible(False)

# Axis labels and title
ax.set_title("Outgoing Mobile Call Sequence")
# plt.xlabel("time", fontsize=9)
# plt.ylabel("Call freq. (darker\nis more frequent)", fontsize=9)

plt.show()
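
The example above draws one continuous strip. To put the events on a 24-hour timeline, as the question asks, one option is to cut every gap at midnight so that each piece lies within a single day. The sketch below does only that bookkeeping, with the standard library. The function name `day_segments` and the tuple layout are assumptions made for illustration, and it assumes naive (timezone-free) datetimes for a single respondent.

```python
from datetime import datetime, timedelta

def day_segments(timestamps):
    """Split the gaps between consecutive events into per-day pieces.

    Returns tuples (day, start_hour, end_hour, gap_seconds). The hours are
    fractional hours since midnight of `day`, and gap_seconds is the length
    of the whole gap that the piece belongs to.
    """
    events = sorted(timestamps)
    segments = []
    for start, end in zip(events[:-1], events[1:]):
        gap = (end - start).total_seconds()
        cursor = start
        # Walk through the gap one calendar day at a time
        while cursor < end:
            midnight = datetime.combine(cursor.date(), datetime.min.time())
            next_midnight = midnight + timedelta(days=1)
            piece_end = min(end, next_midnight)
            segments.append((
                cursor.date(),
                (cursor - midnight).total_seconds() / 3600,
                (piece_end - midnight).total_seconds() / 3600,
                gap,
            ))
            cursor = piece_end
    return segments

# Hypothetical sample: the first gap crosses midnight and is split in two
sample = [
    datetime(2023, 11, 19, 22, 50),
    datetime(2023, 11, 20, 0, 50),
    datetime(2023, 11, 20, 2, 20),
]
segments = day_segments(sample)
for segment in segments:
    print(segment)
```

Each piece could then be drawn with `ax.fill_between` on an x-axis running from 0 to 24, one row per day, coloured with `cmap(norm(gap_seconds))` as in the example above. For the full dataset, the timestamps would first have to be grouped by `respondent_id`.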