I'm trying to use re.findall() to find all occurrences of weekday names. It works when I leave out \b, but not when I include it. This works:
any_week_day_long = "([Mm]onday|[Tt]uesday|[Ww]ednesday|[Tt]hursday|[Ff]riday|[Ss]aturday|[Ss]unday)"
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.")
But this doesn't:
any_week_day_long = "\b([Mm]onday|[Tt]uesday|[Ww]ednesday|[Tt]hursday|[Ff]riday|[Ss]aturday|[Ss]unday)\b"
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.")
It seems to me that it should find Monday, Tuesday and Wednesday just fine with \b, but when I print match, it's just an empty list.
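In a regular (non-raw) string literal, Python interprets \b as the backspace character, so the regex engine never sees a word boundary. Use \\b instead of \b.
Try: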
any_week_day_long = "\\b([Mm]onday|[Tt]uesday|[Ww]ednesday|[Tt]hursday|[Ff]riday|[Ss]aturday|[Ss]unday)\\b"
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.")
OUTPUT
['Monday', 'Tuesday', 'Wednesday']
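To see why the unescaped version returns an empty list, here is a minimal sketch comparing an ordinary string literal with a raw one. The names `text`, `plain` and `raw` are only illustrative and are not from the original post.

```python
import re

text = "Monday is a great day of the week. Tuesday is pretty good."

plain = "\bMonday\b"   # ordinary literal: each \b becomes the backspace character (0x08)
raw = r"\bMonday\b"    # raw literal: the backslash and 'b' reach the regex engine intact

print(len("\b"), len(r"\b"))    # 1 2
print("\b" == "\x08")           # True
print("\\b" == r"\b")           # True
print(re.findall(plain, text))  # []
print(re.findall(raw, text))    # ['Monday']
```

The ordinary literal asks the regex to match literal backspace characters around "Monday", and the text contains none.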
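You can achieve the same with a raw string. Also, instead of writing something like [Mm] for every name, it is better to use the re.IGNORECASE flag. A cleaner way to do the same: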
any_week_day_long = r'\b(?:mon|tues|wednes|thurs|fri|satur|sun)day\b'
match = re.findall(any_week_day_long, "Monday is a great day of the week. Tuesday is pretty good, but Wednesday has it beat.", flags=re.IGNORECASE)
Output:
['Monday', 'Tuesday', 'Wednesday']
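One detail worth checking when factoring out the common "day" suffix: re.findall returns the contents of a capturing group when the pattern has one, so the non-capturing (?:...) form matters here. A small sketch, using a shortened alternation and illustrative variable names that are not from the original post:

```python
import re

sentence = ("Monday is a great day of the week. "
            "Tuesday is pretty good, but Wednesday has it beat.")

# Capturing group: findall returns only what the group captured.
capturing = re.findall(r"\b(mon|tues|wednes)day\b", sentence, flags=re.IGNORECASE)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r"\b(?:mon|tues|wednes)day\b", sentence, flags=re.IGNORECASE)

print(capturing)      # ['Mon', 'Tues', 'Wednes']
print(non_capturing)  # ['Monday', 'Tuesday', 'Wednesday']
```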