A function for running a Kaplan-Meier model (Python)

Question · Votes: 1 · Answers: 1

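My dataset data1 has a column Region with 3 categories: ASIA, EUROPE and NORTH AMERICA. I am trying to use a Kaplan-Meier (KM) model to run a survival analysis on certain machine parts belonging to these 3 regions. The variable used is the number of operating hours before the machine breaks down. I used the following code, which runs fine: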

import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

# Durations: operating hours before each machine broke down
T = data1['op_hours']

# Boolean masks, one per region
Region_Asia = (data1['Region'] == 'ASIA')
Region_EUROPE = (data1['Region'] == 'EUROPE')
Region_NORTH = (data1['Region'] == 'NORTH AMERICA')

kmf = KaplanMeierFitter()
ax = plt.subplot(111)

# Refit the same fitter for each region and draw every curve on the shared axes
kmf.fit(T[Region_Asia], label="Asia")
kmf.plot(ax=ax, ci_force_lines=False)
kmf.fit(T[Region_EUROPE], label="Europe")
kmf.plot(ax=ax, ci_force_lines=False)
kmf.fit(T[Region_NORTH], label="North America")
kmf.plot(ax=ax, ci_force_lines=False)
plt.ylim(0, 1)
plt.title("Lifespans of different machines")

I get the following plot: [image]

Now I am trying to write a function so that I don't have to write separate lines of code for each category to get the KM fit. I tried this:

def Kaplan(c):
    a=[]
    u=[]
    u=c.unique()
    T=data1['op_hours']
    from lifelines import KaplanMeierFitter
    kmf = KaplanMeierFitter()
    ax = plt.subplot(111)
    for i in range(len(u)):
        a=u[i]
        kmf.fit(T[a])
        kmf.plot(ax=ax,ci_force_lines=False)
        plt.ylim(0, 1);
        plt.title("Lifespans of different machines")

Kaplan(data1.Region)

I get: KeyError: 'ASIA'. Can someone help me? I am still new to coding. Many thanks.

python function loops survival-analysis
1 Answer
2 votes

Based on the code you gave at the beginning, you can do this:

import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter

def Kaplan(dt, time, regions):
    # Select the durations of one region with a boolean mask
    tobefit = lambda region: dt[time][(dt['Region'] == region)]
    ax = plt.subplot(111)
    kmf = KaplanMeierFitter()
    for region in regions:
        kmf.fit(tobefit(region), label=region)
        kmf.plot(ax=ax, ci_force_lines=False)
    plt.ylim(0, 1)
    plt.title("Lifespans of different machines")

# Region names must match the values stored in the Region column exactly
Kaplan(data1, "op_hours", ["ASIA", "EUROPE", "NORTH AMERICA"])
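As for the KeyError in your own attempt: T[a] with a = 'ASIA' asks the Series for an index label called 'ASIA', but the index of T holds row labels, not region names. The working code passes a boolean mask instead. Here is a minimal sketch of the difference, assuming a small made-up frame in place of your data1 (the values are invented for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for data1; the numbers are made up.
data1 = pd.DataFrame({
    "op_hours": [120, 340, 560, 90],
    "Region": ["ASIA", "EUROPE", "ASIA", "NORTH AMERICA"],
})

T = data1["op_hours"]

# T["ASIA"] is a label lookup on the index (0..3 here), so it raises KeyError.
try:
    T["ASIA"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# A boolean mask keeps the rows whose Region equals "ASIA".
asia_hours = T[data1["Region"] == "ASIA"]

print(lookup_failed)        # True
print(asia_hours.tolist())  # [120, 560]
```

So inside your loop, T[data1['Region'] == a] would be the direct replacement for T[a].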

Update

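If your arguments are fixed and you don't want to type them every time you call the function, you can define the function with default arguments: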

# A tuple is used for the default so it cannot be mutated between calls
def Kaplan(dt, time="op_hours", regions=("ASIA", "EUROPE", "NORTH AMERICA")):
    # Select the durations of one region with a boolean mask
    tobefit = lambda region: dt[time][(dt['Region'] == region)]
    ax = plt.subplot(111)
    kmf = KaplanMeierFitter()
    for region in regions:
        kmf.fit(tobefit(region), label=region)
        kmf.plot(ax=ax, ci_force_lines=False)
    plt.ylim(0, 1)
    plt.title("Lifespans of different machines")

# Then you can call your Kaplan function without specifying time and regions
Kaplan(data1)
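If you would rather not list the regions at all, which is what your loop over unique() was aiming for, one option is to let pandas split the frame for you. The following is only a sketch under that assumption: durations_by_group and kaplan_all are names I made up, and the plotting libraries are imported inside the function so that the grouping helper can be tried on its own.

```python
import pandas as pd

def durations_by_group(dt, time="op_hours", group="Region"):
    """Return {group value: Series of durations} for each distinct group value."""
    return {name: sub[time] for name, sub in dt.groupby(group)}

def kaplan_all(dt, time="op_hours", group="Region"):
    """Fit and plot one Kaplan-Meier curve per distinct value in `group`."""
    # Imported here so durations_by_group works without these packages.
    import matplotlib.pyplot as plt
    from lifelines import KaplanMeierFitter

    ax = plt.subplot(111)
    kmf = KaplanMeierFitter()
    for name, durations in durations_by_group(dt, time, group).items():
        kmf.fit(durations, label=str(name))
        kmf.plot(ax=ax)
    ax.set_ylim(0, 1)
    ax.set_title("Lifespans of different machines")
    return ax
```

Calling kaplan_all(data1) should then draw one curve per region found in the data, and passing group="SomeOtherColumn" would reuse the same function for a different categorical column.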