How to connect data points with a line when values are missing


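I need to plot the changes of several biomarkers by date on a single chart, but the biomarker samples were measured on different dates and at different times, so some values are missing. For example: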

data = {
    'PatientID': [244651, 244651, 244651, 244651, 244652, 244653, 244651],
    'LocationType': ['IP', 'IP', 'OP', 'IP', 'IP', 'OP', 'IP'],
    'Date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-01', '2023-01-01', '2023-01-05'],
    'Biomarker1': [1.1, 1.2, None, 1.4, 2.1, 3.1, 1.5],
    'Biomarker2': [2.1, None, 2.3, 2.4, 3.1, 4.1, 2.5],
    'Biomarker3': [3.1, 3.2, 3.3, None, 4.1, 5.1, 3.5]
}

Plotting the chart:

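import matplotlib.pyplot as plt
import pandas as pd

# Build the frame and keep one patient
# (assumed filter; the original snippet does not show how filtered_df was made)
df = pd.DataFrame(data)
filtered_df = df[df['PatientID'] == 244651].copy()

# Set the date as the index
filtered_df.set_index('Date', inplace=True)

# Plot all biomarkers
plt.figure(figsize=(12, 8))

# Loop through each biomarker column to plot
for column in filtered_df.columns:
    if column not in ['PatientID', 'LocationType']:
        plt.plot(filtered_df.index, filtered_df[column], marker='o', linestyle='-', label=column)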

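This is my output: [image: biomarkers changing over time]

I need all the points of a given biomarker to be connected by a line. I cannot use interpolation; the existing points should simply be joined by lines.

How can I do this? Please help!

I tried interpolation, but it created new points, and I don't need new points.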

python pandas dataframe date plot
1 Answer

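You can use interpolation only to draw the complete lines, then plot the original (non-interpolated) data on top so that markers appear only at the measured points, like this: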

from matplotlib import pyplot as plt
# For color matching.
from matplotlib.colors import TABLEAU_COLORS
import pandas as pd

# Set-up.
data = {
    'PatientID': [244651, 244651, 244651, 244651, 244652, 244653, 244651],
    'LocationType': ['IP', 'IP', 'OP', 'IP', 'IP', 'OP', 'IP'],
    'Date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-01', '2023-01-01', '2023-01-05'],
    'Biomarker1': [1.1, 1.2, None, 1.4, 2.1, 3.1, 1.5],
    'Biomarker2': [2.1, None, 2.3, 2.4, 3.1, 4.1, 2.5],
    'Biomarker3': [3.1, 3.2, 3.3, None, 4.1, 5.1, 3.5]
}

df = pd.DataFrame(data)
df = df.set_index("Date")

# To match your filtered data.
filtered_df = df.loc[df.PatientID.eq(244651)]
# Limit columns to plot.
cols_to_plot = df.loc(axis="columns")["Biomarker1":].columns
# Interpolate to fill missing values (used to plot lines, not markers).
interpolated_df = filtered_df[cols_to_plot].interpolate()

# Plot everything.
fig, ax = plt.subplots()
# Plot interpolated lines.
interpolated_df.plot.line(
    ax=ax,
    # Turn off `legend` to avoid duplication.
    legend=False,
)
# Plot non-interpolated points.
filtered_df[cols_to_plot].plot(
    ax=ax,
    linestyle="-",
    marker="o",
    # Reuse the default color cycle (`TABLEAU_COLORS`) so markers match the lines.
    color=list(TABLEAU_COLORS.values()),
)

plt.show()

[image: resulting plot]
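
If interpolation is off the table entirely, an alternative that is not part of the answer above is to drop the missing values separately for each biomarker before plotting. Matplotlib breaks a line wherever it meets NaN, so once the NaN rows are removed for a column, the remaining measured points are joined directly and no new points are created. A minimal sketch, assuming the same sample data and the same patient filter as above; the name `biomarker_cols` is my own, and converting `Date` to datetime is my choice so the x-axis is spaced by real time:

import matplotlib.pyplot as plt
import pandas as pd

data = {
    'PatientID': [244651, 244651, 244651, 244651, 244652, 244653, 244651],
    'LocationType': ['IP', 'IP', 'OP', 'IP', 'IP', 'OP', 'IP'],
    'Date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-01', '2023-01-01', '2023-01-05'],
    'Biomarker1': [1.1, 1.2, None, 1.4, 2.1, 3.1, 1.5],
    'Biomarker2': [2.1, None, 2.3, 2.4, 3.1, 4.1, 2.5],
    'Biomarker3': [3.1, 3.2, 3.3, None, 4.1, 5.1, 3.5]
}

df = pd.DataFrame(data)
# Real datetimes so the x-axis reflects actual time gaps (my assumption).
df['Date'] = pd.to_datetime(df['Date'])

# Same single-patient filter as in the answer above.
filtered_df = df.loc[df['PatientID'].eq(244651)].set_index('Date').sort_index()

# Illustrative name: the columns to draw.
biomarker_cols = ['Biomarker1', 'Biomarker2', 'Biomarker3']

fig, ax = plt.subplots(figsize=(12, 8))
for column in biomarker_cols:
    # Keep only the dates on which this biomarker was actually measured.
    measured = filtered_df[column].dropna()
    ax.plot(measured.index, measured.to_numpy(), marker='o', linestyle='-', label=column)

ax.set_xlabel('Date')
ax.set_ylabel('Value')
ax.legend()
fig.autofmt_xdate()

Call plt.show() or fig.savefig(...) afterwards to render the figure. Each line then contains only the measured values, so every marker is a real sample and the segments simply skip over the dates with no measurement.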
