TypeError: string indices must be integers - converting with to_datetime and applying an if/elif function


I have a DataFrame called merged_df. All of its columns are non-null and have dtype object:

merged_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 889 entries, 0 to 888
Data columns (total 7 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   UNIQUE_ID        889 non-null    object
 1   REGISTERED_NAME  889 non-null    object
 2   EMAIL            889 non-null    object
 3   DBS_CHECK_DATE   889 non-null    object
 4   EXPIRY_DATE      889 non-null    object
 5   UNIQUE_ID        889 non-null    object
 6   Status           889 non-null    object
dtypes: object(7)
memory usage: 48.7+ KB

The goal is simply to add a new column whose value depends on a set of conditions involving the EXPIRY_DATE and Status columns:

import pandas as pd
import datetime
from dateutil.relativedelta import relativedelta

def current_month():
    return datetime.datetime.now().strftime("%m/%Y")

def get_current_date():
    return datetime.datetime.now().strftime("%d/%m/%Y")

def three_month_ahead():
    current = datetime.datetime.now()
    three_month = current + relativedelta(months=3)
    return three_month.strftime("%m/%Y")

def next_month_expiry():
    current = datetime.datetime.now()
    nextmonth = current + relativedelta(months=1)
    return nextmonth.strftime("%m/%Y")

def year_ahead():
    current = datetime.datetime.now()
    year_on = current + relativedelta(months = 12)
    return year_on.strftime("%d/%m/%Y")
merged_df['Action'] = '' # the column to which I apply the below function to 
def action_col(row):
    expiry_date = row['EXPIRY_DATE']
    unique_id = row['UNIQUE_ID']
    three_months_ahead = pd.to_datetime(three_month_ahead(), format='%m/%Y')
    next_month = pd.to_datetime(next_month_expiry(), format='%m/%Y')
    current_date_today = pd.to_datetime(get_current_date(), format='%d/%m/%Y')
    year_on = pd.to_datetime(year_ahead(), format='%d/%m/%Y')

    if f"{expiry_date.year}-{expiry_date.month:02}" == f"{three_months_ahead.year}-{three_months_ahead.month:02}":
        return 'Send 3 month request'
    elif expiry_date.month == next_month.month and expiry_date.year == next_month.year:
        return 'Send 1 month reminder'
    elif expiry_date < current_date_today and row['Status'] == "Not Suspended":
        return 'DBS expired: Suspend & update iAdmin notes'
    elif (year_on < expiry_date < current_date_today) and row['Status'] == "Suspended":
        return 'No action needed - correct suspensions in place'
    elif unique_id in SAP_only_EAs:
        return 'No action needed - SAP only assessor'
    elif (expiry_date < year_on) and row['Status'] == "Suspended":
        return 'DBS expired for over a year – look at whether account closure is appropriate'
    else:
        return 'No action required – valid DBS check'
merged_df['Action'] = merged_df['Action'].apply(action_col)

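It throws a "string indices must be integers" error that points at the line expiry_date = row['EXPIRY_DATE']. Any help is much appreciated (hopefully this question makes more sense than my first post, haha).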

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
c:\Users\orla_quidos\Code\DBS project\STATUS section of the scheduled job python script.ipynb Cell 7 line 5
     54     else:
     55         return 'No action required – valid DBS check'
---> 58 merged_df['Action'] = merged_df['Action'].apply(action_col)

File c:\Users\orla_quidos\anaconda3\lib\site-packages\pandas\core\series.py:4771, in Series.apply(self, func, convert_dtype, args, **kwargs)
   4661 def apply(
   4662     self,
   4663     func: AggFuncType,
   (...)
   4666     **kwargs,
   4667 ) -> DataFrame | Series:
   4668     """
   4669     Invoke function on values of Series.
   4670 
   (...)
   4769     dtype: float64
   4770     """
-> 4771     return SeriesApply(self, func, convert_dtype, args, kwargs).apply()

File c:\Users\orla_quidos\anaconda3\lib\site-packages\pandas\core\apply.py:1123, in SeriesApply.apply(self)
   1120     return self.apply_str()
   1122 # self.f is Callable
...
---> 35     expiry_date = row['EXPIRY_DATE']
     36     unique_id = row['UNIQUE_ID']
     37     three_months_ahead = pd.to_datetime(three_month_ahead(), format='%m/%Y')

TypeError: string indices must be integers
python function typeerror datetime-format
1 Answer

Your function works with multiple columns.

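merged_df['Action'] = merged_df['Action'].apply(action_col)
applies the function only to the Action column, so action_col receives each cell value (here the empty string '') rather than a whole row. Indexing that string with row['EXPIRY_DATE'] is what raises the TypeError.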
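To see the difference, here is a minimal sketch on a hypothetical two-row frame standing in for merged_df (the sample values and the describe helper are made up for illustration). It only reports what kind of object each flavour of apply passes to the function:

```python
import pandas as pd

# Hypothetical stand-in for merged_df; values are illustrative only
df = pd.DataFrame({
    "EXPIRY_DATE": ["01/02/2024", "15/08/2025"],
    "Status": ["Suspended", "Not Suspended"],
    "Action": ["", ""],
})

def describe(arg):
    # Report the type of whatever apply() hands to the function
    return type(arg).__name__

# Series.apply: the function is called once per cell value
series_result = df["Action"].apply(describe)

# DataFrame.apply with axis=1: the function is called once per row
frame_result = df.apply(describe, axis=1)

print(series_result.tolist())
print(frame_result.tolist())

# Indexing a plain string with a column name fails the same way
# as in the traceback
cell = ""
try:
    cell["EXPIRY_DATE"]
    raised = False
except TypeError:
    raised = True
print(raised)
```

With Series.apply the function sees plain strings, while with axis=1 it sees a Series per row, which is what row['EXPIRY_DATE'] needs.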

Try this instead:

merged_df['Action'] = merged_df.apply(action_col,axis=1)

There is also no need to create the empty column first:

merged_df['Action'] = ''  # no need.
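
One further point that may matter once the row-wise apply is in place: the info() output shows EXPIRY_DATE as dtype object, so the values are probably strings, and attributes such as .year or comparisons against a Timestamp would then fail inside action_col. The sketch below assumes the dates are stored as day/month/year text (an assumption, since the question does not show sample values) and uses a hypothetical two-row frame, a fixed reference date, and a made-up is_expired helper purely for illustration:

```python
import pandas as pd

# Hypothetical stand-in for merged_df; values are illustrative only
df = pd.DataFrame({
    "EXPIRY_DATE": ["01/02/2024", "15/08/2025"],
    "Status": ["Suspended", "Not Suspended"],
})

# Assumption: the text dates use the day/month/year layout
df["EXPIRY_DATE"] = pd.to_datetime(df["EXPIRY_DATE"], format="%d/%m/%Y")

# Fixed reference date so the example is reproducible;
# real code would use pd.Timestamp.now().normalize()
today = pd.Timestamp("2025-01-01")

def is_expired(row):
    # row is a Series, so column labels work as keys,
    # and EXPIRY_DATE is now a Timestamp that can be compared
    return row["EXPIRY_DATE"] < today

df["Expired"] = df.apply(is_expired, axis=1)
print(df)
```

Converting the column once, before calling apply, also avoids re-parsing the same date strings on every row.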