如何从 scikit-survival 中的 CoxPHSurvivalAnalysis 获取概率密度函数?

问题描述 投票:0回答:1

我正在使用 sksurv.linear_model.CoxPHSurvivalAnalysis 来拟合 cox ph 回归,我想恢复密度函数 f(t)。 sksurv 类具有预测生存函数和累积分布函数 S(t) = 1-F(t) 和累积风险函数 $H(t)$ 的方法,但它似乎没有产生密度函数。

我的用例没有审查,所以这是一个例子:

import pandas as pd
import numpy as np
from sksurv.linear_model import CoxPHSurvivalAnalysis

data = np.random.randint(5,30,size=10)
X_train = pd.DataFrame(data, columns=['covariate'])

y_train = np.array(np.random.randint(0,100,size=10)/100,dtype=[('status',bool),('target',float)])

estimator = CoxPHSurvivalAnalysis()
estimator.fit(X_train,y_train)

X_test = pd.DataFrame({'covariate':[12,2]})
chf = estimator.predict_cumulative_hazard_function(X_test)
cdf = estimator.predict_survival_function(X_test)

fig, ax = plt.subplots(1,2)
for fn_h, fn_c in zip(chf, cdf):
    ax[0].step(fn_h.x,fn_h(fn_h.x),where='post')
    ax[1].step(fn_c.x,fn_c(fn_c.x),where='post')

ax[0].set_title('Cumulative Hazard Functions')
ax[1].set_title('Survival Functions')
plt.show()


我怎样才能访问并绘制密度函数?

python pandas numpy survival-analysis scikit-survival
1个回答
0
投票

概率密度函数(PDF)可以从累积分布函数(CDF)获得:

f(t) = d(F(t)/dt

现在,在生存分析 (SA) 中,PDF

(f(t))
可以用生存函数
S(t)
和风险函数
h(t)
来表示,由下式给出:

f(t) = h(t) x S(t)

其中

S(t) = 1 - F(t)
h(t) = -dS(t)/dt x S(t) = dH(t)/dt

因此,PDF

f(t)
可以表示为:
f(t) = dH(t)/dt x S(t)

现在,为了计算风险函数

f(t)
,我们需要累积风险函数 (CHF)
H(t)
的导数。由于 CHF 都是离散数据点,因此我们需要
InterpolatedUnivariateSpline
库中的
scipy
来区分它。它创建 CHF 的平滑样条插值,然后可以对其进行微分以获得
h(t)
。这是对粘贴的代码的轻微修改:

# Import the necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sksurv.linear_model import CoxPHSurvivalAnalysis
from scipy.interpolate import InterpolatedUnivariateSpline

# Define a function to compute the probability density function (pdf) 
# from the cumulative hazard function (chf) and survival function (sf).
def compute_pdf_from_chf_and_sf(chf, sf):
    # The hazard function is the derivative of the cumulative hazard function.
    # We use InterpolatedUnivariateSpline for spline interpolation to create a smooth 
    # function approximation of the CHF. This provides us with a smooth curve that 
    # passes through each data point, allowing us to differentiate the function and obtain 
    # the hazard function.
    chf_spline = InterpolatedUnivariateSpline(chf.x, chf(chf.x))
    hazard_function = chf_spline.derivative()(chf.x)
    
    # The pdf can be computed using the formula: pdf(t) = hazard(t) * survival(t)
    pdf = hazard_function * sf(chf.x)
    return chf.x, pdf

# Generate random data for demonstration purposes
# Here, we create a random dataset with one covariate and survival times.

np.random.seed(42)  # Setting a fixed seed.
data = np.random.randint(5, 30, size=10)
X_train = pd.DataFrame(data, columns=['covariate'])
y_train = np.array(np.random.randint(0, 100, size=10)/100, dtype=[('status', bool), ('target', float)])

# Initialize and fit the Cox Proportional Hazards model
estimator = CoxPHSurvivalAnalysis()
estimator.fit(X_train, y_train)

# Predict for new data points
X_test = pd.DataFrame({'covariate': [12, 2]})
cumulative_hazard_functions = estimator.predict_cumulative_hazard_function(X_test)
survival_functions = estimator.predict_survival_function(X_test)

# Plot the Cumulative Hazard, Survival, and PDF side by side in a single row
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
for chf, sf in zip(cumulative_hazard_functions, survival_functions):
    # Compute the pdf using our defined function
    times, pdf_values = compute_pdf_from_chf_and_sf(chf, sf)
    
    # Plotting the cumulative hazard function
    axes[0].step(chf.x, chf(chf.x), where='post')
    
    # Plotting the survival function
    axes[1].step(sf.x, sf(sf.x), where='post')
    
    # Plotting the probability density function
    axes[2].step(times, pdf_values, where='post')

# Setting titles for each subplot
axes[0].set_title('Cumulative Hazard Functions')
axes[1].set_title('Survival Functions')
axes[2].set_title('Probability Density Functions')

# Display the plots
plt.tight_layout()
plt.show()


导致

参考文献:用于生存分析的机器学习:调查

© www.soinside.com 2019 - 2024. All rights reserved.