Python regex: replacing a substring that has many possible variants

Question

I want to remove a label such as Fig 1. from the string caption, where caption may be any of the following:

# each line is one instance of caption
"Figure 1: Path of Reading Materials from the Web to a Student."
"FIGURE 1 - Travel CP-net"
"Figure 1 Interpretation as abduction, the big picture."
"Fig. 1. The feature vector components"
"Fig 1: IMAGACT Log-in Page"
"FIG 1 ; The effect of descriptive and interpretive information, and Inclination o f Fit"
...

I have tried caption = re.sub(r'figure 1: |fig. 1 |figure 1 -', '', caption, flags=re.IGNORECASE), but it looks messy: do I really need to list every possibility by hand? Is there a more elegant regex that matches all of them?

Thanks a lot!

Answer 1

You can use the following pattern with case-insensitive matching (for example, flags=re.IGNORECASE):

\bfig\.?(?:ure)? 1[:.]?
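
Applied to the sample captions, the pattern above removes the label itself but does not consume the whitespace after it, or the - and ; separators used in two of the examples. A minimal sketch of a broader variant follows. The pattern, the FIGURE_LABEL name, and the strip_figure_label helper are illustrative assumptions rather than part of the original answer. It also assumes the label always sits at the start of the caption and that any figure number, not just 1, should be removed.

```python
import re

# Hypothetical pattern (not the answer's original): anchored at the start of
# the caption, it accepts "Fig", "Fig." or "Figure" in any letter case, any
# figure number, and an optional ":", ".", ";" or "-" separator together with
# the whitespace around it.
FIGURE_LABEL = re.compile(r'^\s*fig(?:ure|\.)?\s*\d+\s*[:.;-]?\s*', re.IGNORECASE)


def strip_figure_label(caption: str) -> str:
    """Return the caption without its leading figure label, if it has one."""
    return FIGURE_LABEL.sub('', caption, count=1)


captions = [
    "Figure 1: Path of Reading Materials from the Web to a Student.",
    "FIGURE 1 - Travel CP-net",
    "Figure 1 Interpretation as abduction, the big picture.",
    "Fig. 1. The feature vector components",
    "Fig 1: IMAGACT Log-in Page",
    "FIG 1 ; The effect of descriptive and interpretive information, and Inclination o f Fit",
]

for caption in captions:
    print(strip_figure_label(caption))
```

Because the pattern is anchored with ^, a caption that merely mentions a figure further along in the text is left unchanged. Drop the anchor if labels can also appear mid-string.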

