Why does cherry both match and not match?

Question

When searching for the string "cherry" in the string "Cherry/berry", I expected re.IGNORECASE and str.lower() to give the same result, but they do not. Why?

import pandas as pd
import re

data = {"description": ["Cherry/berry"]}
df = pd.DataFrame(data)


is_contained = df["description"].str.contains(r"\b(cherry)\b", re.IGNORECASE)
print(is_contained[0]) # False

is_contained = df["description"].str.lower().str.contains(r"\b(cherry)\b")
print(is_contained[0]) # True

Answer 1

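As explained in the comments, you should not pass a regex flag to the case parameter of contains. In the question, the flag is the second positional argument, which pandas binds to case, so the call is equivalent to: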

df["description"].str.contains(r"\b(cherry)\b", case=re.IGNORECASE)

Instead, you need to use the flags parameter:

df["description"].str.contains(r"\b(cherry)\b", flags=re.IGNORECASE)

Or set the case parameter to False:

df["description"].str.contains(r"\b(cherry)\b", case=False)

Output:

0    True
Name: description, dtype: bool

Why did it fail?

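Regex flags are just integers (re.IGNORECASE is actually 2). When one is passed to the case parameter, which expects a boolean, it is evaluated for truthiness, so you are effectively running: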

df["description"].str.contains(r"\b(cherry)\b", case=bool(2))

# or
df["description"].str.contains(r"\b(cherry)\b", case=True)
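The truthiness argument can be checked without pandas. The sketch below is only an illustration: the contains helper is a hypothetical, simplified stand-in for how a case/flags pair might be combined, not the actual pandas implementation. It assumes that only a falsy case turns on case-insensitive matching.

```python
import re

# re.IGNORECASE is an integer-valued flag, so it is truthy.
print(int(re.IGNORECASE))   # 2
print(bool(re.IGNORECASE))  # True


def contains(text, pattern, case=True, flags=0):
    """Hypothetical, simplified model of case/flags handling (not pandas code)."""
    if not case:
        # Assumption: only a falsy `case` enables case-insensitive matching.
        flags |= re.IGNORECASE
    return re.search(pattern, text, flags=flags) is not None


text = "Cherry/berry"
pattern = r"\b(cherry)\b"

# The flag lands in `case`; it is truthy, so matching stays case-sensitive.
flag_as_case = contains(text, pattern, case=re.IGNORECASE)
# The flag is passed where it belongs.
flag_as_flags = contains(text, pattern, flags=re.IGNORECASE)
# Case-insensitivity requested through `case` itself.
case_false = contains(text, pattern, case=False)

print(flag_as_case, flag_as_flags, case_false)  # False True True
```

Passing the flag as case leaves the search case-sensitive, which is why the original call returned False, while the other two calls find "Cherry".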
