I have an extract in which I need to identify a certain type of surgery, `X`; see the `Surg Type` column.
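I need to keep the medical appointments, each recorded as a separate row, that fall within a window around the surgery: the 3 appointments before it (-3, -2, -1) and the 3 appointments after it (+1, +2, +3).
This order must be included as an additional column.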
In addition, I need to exclude any appointments outside the window and any other `Surg Type`; in this example, any other surgery is denoted `Z`.
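In this example, I want to keep 7 of the 9 rows/records and add one column, `Prior Post` (labelled `Inclusion` in the desired output below).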
*** Updated example ***
Original Df
| Patient ID | Surg ID | Surg Type | Surg Date | Medical Appt | Medical Appt Date |
|------------|---------|-----------|------------|--------------|-------------------|
| 1 | 1 | X | 2022-09-03 | Y | 2022-01-01 |
| 1 | 1 | X | 2022-09-03 | Y | 2022-03-04 |
| 1 | 1 | X | 2022-09-03 | Y | 2022-05-04 |
| 1 | 1 | X | 2022-09-03 | N | NaT |
| 1 | 1 | X | 2022-09-03 | Y | 2022-11-04 |
| 1 | 1 | X | 2022-09-03 | Y | 2022-11-29 |
| 1 | 2 | Z | 2022-12-01 | N | NaT |
| 1 | 1 | X | 2022-09-03 | Y | 2023-01-02 |
| 1 | 1 | X | 2022-09-03 | Y | 2023-01-13 |
Desired Df
| Patient ID | Surg ID | Surg Type | Surg Date | Medical Appt | Medical Appt Date | Inclusion |
|------------|---------|-----------|------------|--------------|-------------------|-------------|
| 1 | 1 | X | 2022-09-03 | Y | 2022-01-01 | -3 |
| 1 | 1 | X | 2022-09-03 | Y | 2022-03-04 | -2 |
| 1 | 1 | X | 2022-09-03 | Y | 2022-05-04 | -1 |
| 1 | 1 | X | 2022-09-03 | N | NaT | |
| 1 | 1 | X | 2022-09-03 | Y | 2022-11-04 | +1 |
| 1 | 1 | X | 2022-09-03 | Y | 2022-11-29 | +2 |
| 1 | 2 | Z | 2022-12-01 | N | NaT | Exclude row |
| 1 | 1 | X | 2022-09-03 | Y | 2023-01-02 | +3 |
| 1 | 1 | X | 2022-09-03 | Y | 2023-01-13 | Exclude row |
# Remove rows that are not Surgery X or appointment
df = df.loc[df["Surg Type"].eq("X") | df["Medical Appt"].eq("Y")].reset_index(drop=True)
# Fill Prior Post column, assuming exactly one row is the surgery itself (Surg Type X with Medical Appt N)
surgery_idx = df[df["Surg Type"].eq("X") & df["Medical Appt"].eq("N")].index[0]
df["Prior Post"] = df.index - surgery_idx
# Remove rows where Prior Post is outside of limits
df = df.loc[(df["Prior Post"] >= -3) & (df["Prior Post"] <= 3)]
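
The positional approach above depends on the rows already being in chronological order and on the frame holding a single surgery of interest. If that cannot be guaranteed, one option is to number the appointments by comparing `Medical Appt Date` with `Surg Date` inside each patient/surgery group. The following is only a sketch under stated assumptions: both date columns are already datetimes, rows without an appointment date are kept unlabeled, and the helper name `label_window` and its parameters `surg_type` and `n` are hypothetical, not part of the original question or answer.

```python
import pandas as pd


def label_window(df, surg_type="X", n=3):
    """Hypothetical helper: number appointments around each surgery by date."""
    keys = ["Patient ID", "Surg ID"]
    # Keep only the surgery type of interest, ordered by appointment date
    # (rows without an appointment date, NaT, sort last).
    out = df.loc[df["Surg Type"].eq(surg_type)].sort_values("Medical Appt Date")
    before = out["Medical Appt Date"] < out["Surg Date"]
    after = out["Medical Appt Date"] > out["Surg Date"]

    out["Prior Post"] = float("nan")
    # Latest prior appointment gets -1, the one before it -2, and so on.
    out.loc[before, "Prior Post"] = -(
        out[before].groupby(keys).cumcount(ascending=False) + 1
    )
    # Earliest later appointment gets +1, the next +2, and so on.
    out.loc[after, "Prior Post"] = out[after].groupby(keys).cumcount() + 1

    # Keep unlabeled rows (no appointment date, or an appointment on the
    # surgery date itself) and labels that fall within the window.
    keep = out["Prior Post"].isna() | out["Prior Post"].abs().le(n)
    out = out.loc[keep].sort_index()  # restore the original row order
    out["Prior Post"] = out["Prior Post"].astype("Int64")  # nullable integers
    return out


# Sample data taken from the "Original Df" table in the question.
df = pd.DataFrame({
    "Patient ID": 1,
    "Surg ID": [1, 1, 1, 1, 1, 1, 2, 1, 1],
    "Surg Type": list("XXXXXXZXX"),
    "Surg Date": pd.to_datetime(["2022-09-03"] * 6 + ["2022-12-01"] + ["2022-09-03"] * 2),
    "Medical Appt": list("YYYNYYNYY"),
    "Medical Appt Date": pd.to_datetime([
        "2022-01-01", "2022-03-04", "2022-05-04", None, "2022-11-04",
        "2022-11-29", None, "2023-01-02", "2023-01-13",
    ]),
})

result = label_window(df)
print(result[["Surg Type", "Medical Appt Date", "Prior Post"]])
```

On the sample data this keeps seven of the nine rows: the `Z` surgery and the 2023-01-13 appointment are dropped, and the row without an appointment date is kept with an empty `Prior Post`.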