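This is a follow-up to the question on how to read a .txt file in order to plot a graph.
I have a file containing time-series data in the following format: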
00:01:28,102,103,103 20-03-2024
00:02:16,111,110,110
00:02:33,108,109,109
00:02:49,107,108,108
24 hours read....
23:58:54,111,112,112
23:59:11,109,110,110
23:59:47,115,116,117
00:00:04,115,116,116 21-03-2024
00:00:20,121,122,120
00:00:36,124,125,125
24 hours read...
23:59:02,115,115,116
23:59:19,114,114,114
23:59:51,113,114,115
00:00:07,113,114,115 22-03-2024
00:00:24,116,117,115
00:00:45,115,115,116
24 hours read
23:59:08,101,101,100
23:59:32,103,103,102
23:59:48,102,102,102
Next day
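Each line contains a timestamp and three numeric readings, and sometimes a date marking the start of a new day. I am trying to plot this data with pandas and matplotlib, but I am running into two main problems: the x-axis labels (hours) overlap, and the plot is slow to load.
This is my current plotting approach: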
plt.figure(figsize=(15,9))
plt.xlabel('Day')
plt.ylabel('Voltage')
# Plot three series from the data
plt.plot(C0Temp, C1Temp, label="Voltage", color=LineColorTemp1Text)
plt.plot(C2Temp, C3Temp, label="Max", color='r')
plt.plot(C4Temp, C5Temp, label="Min", color='g')
plt.legend()
# Attempt to format x-axis to handle daily data
locator = mdates.AutoDateLocator(minticks=12, maxticks=24)
plt.gcf().axes[0].xaxis.set_major_locator(locator)
plt.xticks(rotation=45)
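I am looking for guidance on how to plot this data efficiently day by day, or even across months, so that the x-axis labels stay readable and the plot loads quickly.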
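Because the text file is not uniformly formatted, it has to be parsed line by line. This makes it possible to handle variations in how the data is represented, such as whether a line carries a date, and the presence of non-data lines (for example, "24 hours read..." and "Next day"). By reading each line, the script separates data entries from metadata or comments so that only relevant information is processed. Although the file starts out irregular, this approach produces a structured dataset ready for analysis and visualization.
My recommendation is to standardize the output format of the measurements.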
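To illustrate what a standardized format would allow: if every line carried a full timestamp, with a header row and no comment lines, pandas could load the file in a single call. The sketch below assumes such a file; the column names and sample contents are hypothetical and are not the actual logger output.

```python
import io

import pandas as pd

# Hypothetical standardized log: header row, full timestamp on every line.
sample = io.StringIO(
    "DateTime,Value1,Value2,Value3\n"
    "2024-03-20 00:01:28,102,103,103\n"
    "2024-03-20 00:02:16,111,110,110\n"
    "2024-03-21 00:00:04,115,116,116\n"
)

# parse_dates=True converts the index column to datetime while reading.
std_df = pd.read_csv(sample, index_col="DateTime", parse_dates=True)
print(std_df)
```

With a file like this, no custom parsing loop is needed and the result is already indexed by datetime.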
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
# Initialize variables
timestamps = []
values1 = []
values2 = []
values3 = []
current_date = None
# Parse line by line, handling lines with and without a trailing date
# 00_test.txt contains the sample data from the question
with open('00_test.txt', "r") as file:
    for line in file:
        line = line.strip()
        if not line or "hours read" in line or "Next day" in line:
            continue  # Skip non-data lines
        parts = line.split(',')
        if len(parts) == 4 and parts[-1].count('-') == 2:  # The last field ends with a date
            # Split the last field into the third reading and the date.
            # Note: a dated line only updates current_date; its readings are not appended.
            time, val1, val2, val3, date = parts[0], parts[1], parts[2], parts[3].split(' ')[0], parts[3].split(' ')[1]
            current_date = pd.to_datetime(date, format="%d-%m-%Y")
        else:
            # Process data lines without a date
            time, val1, val2, val3 = parts[0], parts[1], parts[2], parts[3]
            if current_date:  # Ensure a date has been set
                datetime_str = f"{current_date.date()} {time}"
                datetime_obj = pd.to_datetime(datetime_str, format="%Y-%m-%d %H:%M:%S")
                timestamps.append(datetime_obj)
                values1.append(float(val1))
                values2.append(float(val2))
                values3.append(float(val3))
# Build the DataFrame once, after the loop
df = pd.DataFrame({'DateTime': timestamps, 'Value1': values1, 'Value2': values2, 'Value3': values3})
df.set_index('DateTime', inplace=True)
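In the loop above, a line that carries a date only sets the current date, so its three readings do not end up in the DataFrame. If those readings should be kept, one possible variant is sketched below. It matches each line against a regular expression and uses only the standard library; the names `LINE_RE` and `parse_lines` are illustrative, and the pattern assumes the readings are plain numbers and the date is always `dd-mm-YYYY`.

```python
import re
from datetime import datetime

# Time, three numeric readings, and an optional trailing dd-mm-YYYY date.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),"
    r"(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)"
    r"(?:\s+(\d{2}-\d{2}-\d{4}))?$"
)


def parse_lines(lines):
    """Return (datetime, v1, v2, v3) tuples, including rows that carry a date."""
    rows = []
    current_date = None
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # Comment lines such as "24 hours read..." or "Next day"
        time_str, v1, v2, v3, date_str = match.groups()
        if date_str is not None:
            current_date = datetime.strptime(date_str, "%d-%m-%Y").date()
        if current_date is None:
            continue  # No date seen yet, so a full timestamp cannot be built
        clock = datetime.strptime(time_str, "%H:%M:%S").time()
        rows.append(
            (datetime.combine(current_date, clock), float(v1), float(v2), float(v3))
        )
    return rows


sample = [
    "00:01:28,102,103,103 20-03-2024",
    "00:02:16,111,110,110",
    "24 hours read....",
    "23:59:47,115,116,117",
    "00:00:04,115,116,116 21-03-2024",
    "00:00:20,121,122,120",
    "Next day",
]
rows = parse_lines(sample)
print(len(rows))
```

The resulting rows can be passed to `pd.DataFrame(rows, columns=['DateTime', 'Value1', 'Value2', 'Value3'])` and indexed on `DateTime` as before.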
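The plot draws the DataFrame with a marker at each data point, a fixed figure size, and labeled axes. Major ticks on the x-axis show the date in '%Y-%m-%d' format, and minor ticks mark the time every 4 hours from 04:00 to 20:00. Major tick labels are rotated 90 degrees and centered, while minor tick labels stay horizontal and centered. Grid lines are drawn at both major and minor intervals and are styled differently to distinguish dates from times. The layout is adjusted to make room for the rotated labels.
Many existing questions already cover plotting with pandas DataFrames and formatting the datetime x-axis of a pandas DataFrame plot. I encourage you to explore those resources and adapt the plot to your requirements. For further plotting queries or specific adjustments, consider posting a new question that references the existing discussions.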
# Plot the DataFrame directly
ax = df.plot(marker='.', figsize=(15, 9), xlabel='Time', ylabel='Voltage')
# Setting the major ticks to display the date in 'Y-m-d' format
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
# Setting the minor ticks to display the time
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=range(4, 21, 4))) # Adjust the interval as needed
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%H:%M'))
# Enhance the display for readability
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90, ha="center") # Rotate major tick labels for better visibility
plt.setp(ax.xaxis.get_minorticklabels(), rotation=0, ha="center") # Keep minor tick labels horizontal and centered
ax.xaxis.grid(True, which='major', linestyle='-', linewidth=0.5, color='black') # Major grid lines
ax.xaxis.grid(True, which='minor', linestyle=':', linewidth=0.5, color='gray') # Minor grid lines
plt.tight_layout() # Adjust layout to make room for tick labels
plt.show()
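For ranges that span weeks or months, one major tick per day with 4-hour minor ticks will crowd the axis again. One option, offered as a sketch under assumptions rather than a definitive recipe, is to let matplotlib choose the tick positions with `AutoDateLocator` and label them with `ConciseDateFormatter`. The helper name `plot_concise` and the synthetic two-month series are illustrative only.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_concise(frame):
    """Plot a datetime-indexed frame with automatically chosen date ticks."""
    fig, ax = plt.subplots(figsize=(15, 9))
    for column in frame.columns:
        ax.plot(frame.index, frame[column], label=column)
    # The locator picks sensible tick positions for whatever range is shown.
    locator = mdates.AutoDateLocator(minticks=6, maxticks=12)
    ax.xaxis.set_major_locator(locator)
    # The formatter shortens labels and moves shared date parts to an offset text.
    ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
    ax.set_xlabel("Time")
    ax.set_ylabel("Voltage")
    ax.legend()
    fig.tight_layout()
    return ax


# Synthetic two-month series, one reading every 10 minutes (illustrative only)
index = pd.date_range("2024-03-20", "2024-05-20", freq="10min")
demo = pd.DataFrame(
    {"Value1": 110 + 5 * np.sin(np.arange(len(index)) / 144)}, index=index
)
ax = plot_concise(demo)
# plt.show()  # Uncomment when running interactively
```

The sketch calls `ax.plot` directly instead of `DataFrame.plot`, because pandas can install its own date tick handling for regularly spaced indexes, which may not combine well with the `matplotlib.dates` locators.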
df
                     Value1  Value2  Value3
DateTime
2024-03-20 00:02:16   111.0   110.0   110.0
2024-03-20 00:02:33   108.0   109.0   109.0
2024-03-20 00:02:49   107.0   108.0   108.0
2024-03-20 23:58:54   111.0   112.0   112.0
2024-03-20 23:59:11   109.0   110.0   110.0
2024-03-20 23:59:47   115.0   116.0   117.0
2024-03-21 00:00:20   121.0   122.0   120.0
2024-03-21 00:00:36   124.0   125.0   125.0
2024-03-21 23:59:02   115.0   115.0   116.0
2024-03-21 23:59:19   114.0   114.0   114.0
2024-03-21 23:59:51   113.0   114.0   115.0
2024-03-22 00:00:24   116.0   117.0   115.0
2024-03-22 00:00:45   115.0   115.0   116.0
2024-03-22 23:59:08   101.0   101.0   100.0
2024-03-22 23:59:32   103.0   103.0   102.0
2024-03-22 23:59:48   102.0   102.0   102.0
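On the slow rendering mentioned in the question: with readings roughly every 16 seconds, a month of data is well over 100,000 points per series. A possible way to reduce the drawing cost, assuming hourly detail is enough for the plot, is to resample before plotting. The sketch below uses a synthetic stand-in for the parsed data; the 16-second spacing and the choice of hourly bins are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the parsed data: one reading every 16 seconds for 30 days
index = pd.date_range("2024-03-20", periods=30 * 5400, freq="16s")
rng = np.random.default_rng(0)
raw = pd.DataFrame({"Value1": 110 + rng.normal(0, 3, len(index))}, index=index)

# Collapse each hour to its mean, minimum and maximum
hourly = raw["Value1"].resample("1h").agg(["mean", "min", "max"])
print(len(raw), "->", len(hourly))
```

Plotting `hourly` instead of `raw` draws far fewer points, and keeping the `min` and `max` columns preserves the range of values within each hour that a plain mean would hide.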