Create a new df column and conditionally assign values based on a substring in another column

Question

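I'm a pandas beginner, so apologies if this seems trivial.

I'm trying to create a new column in a DataFrame where each row's value depends on whether a certain substring appears in an earlier column of the same row.

The code I'm using looks like this: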

def function(x):
    if "a" or "b" or "c" in x:
        return "string"
    elif "d" or "e" or "f" in x:
        return "other string"
    else:
        return "default string"

df['new col'] = df['col'].apply(function)
print(df)

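This successfully adds another column at the end of the DataFrame, but every value in that column is "string". How can I prevent this?

In case the context is useful, df['col'].dtype outputs object.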

Answer 1

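The problem is how you're using the or operator. Python parses "a" or "b" or "c" in x as "a" or "b" or ("c" in x), and a non-empty string such as "a" is always truthy, so the first branch is taken for every row. Each substring needs its own in test. I fixed your function: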

def function(x):

    # Check if 'a', 'b', or 'c' is in the string x
    if "a" in x or "b" in x or "c" in x:
        return "string"

    # Check if 'd', 'e', or 'f' is in the string x
    elif "d" in x or "e" in x or "f" in x:
        return "other string"

    else:
        return "default string"

# Apply this function to each row of column 'col'
df['new col'] = df['col'].apply(function)
print(df)
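
As a possible alternative to apply, the same if/elif/else logic can be sketched in vectorized form with Series.str.contains and numpy.select. This is only a sketch under assumptions: the sample DataFrame below is hypothetical (the question does not show its data), and the character classes [abc] and [def] stand in for the single-letter substrings used above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original question does not show its DataFrame.
df = pd.DataFrame({"col": ["apple", "def", "xyz", "bed"]})

# Boolean masks: True where the row contains any of the listed characters.
# na=False treats missing values as "no match" instead of propagating NaN.
has_abc = df["col"].str.contains("[abc]", regex=True, na=False)
has_def = df["col"].str.contains("[def]", regex=True, na=False)

# np.select uses the first condition that is True for each row,
# which mirrors the if / elif / else order of the function above.
df["new col"] = np.select(
    [has_abc, has_def],
    ["string", "other string"],
    default="default string",
)
print(df)
```

Because np.select checks the conditions in order, a row such as "bed" that matches both masks still gets "string", just as the if branch would win over the elif branch in the function.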
