I want to remove labels such as "Fig 1." from the string caption, where caption may be:
# each line is one instance of caption
"Figure 1: Path of Reading Materials from the Web to a Student."
"FIGURE 1 - Travel CP-net"
"Figure 1 Interpretation as abduction, the big picture."
"Fig. 1. The feature vector components"
"Fig 1: IMAGACT Log-in Page"
"FIG 1 ; The effect of descriptive and interpretive information, and Inclination o f Fit"
...
I have tried caption = re.sub(r'figure 1: |fig. 1 |figure 1 -', '', caption, flags=re.IGNORECASE), but it looks messy. Do I really need to list every possibility by hand? Is there a single regex that matches them all?
Thanks a lot!
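
For reference, here is a minimal sketch of one pattern that might cover the sample captions above. It assumes the label always sits at the start of the caption, is spelled "fig", "fig." or "figure" in any case, and is followed by a whole number and at most one separator out of `:`, `.`, `;` or `-`. The names `FIG_PREFIX` and `strip_figure_label` are made up for illustration.

```python
import re

# Hypothetical pattern, built only from the sample captions:
#   fig(?:ure)?  "fig" or "figure" (case-insensitive via the flag)
#   \.?          optional dot after the abbreviation ("Fig.")
#   \d+          the figure number
#   [:.;-]?      at most one separator character
# Whitespace around each part is optional.
FIG_PREFIX = re.compile(r'^\s*fig(?:ure)?\.?\s*\d+\s*[:.;-]?\s*', re.IGNORECASE)


def strip_figure_label(caption: str) -> str:
    """Remove a leading 'Figure N' style label, if there is one."""
    return FIG_PREFIX.sub('', caption, count=1)


if __name__ == "__main__":
    samples = [
        "Figure 1: Path of Reading Materials from the Web to a Student.",
        "FIGURE 1 - Travel CP-net",
        "Figure 1 Interpretation as abduction, the big picture.",
        "Fig. 1. The feature vector components",
        "Fig 1: IMAGACT Log-in Page",
    ]
    for s in samples:
        print(strip_figure_label(s))
```

Anchoring with `^` keeps the pattern from touching references such as "see Fig. 2" later in the caption. Numbering schemes like "1a" or "1.2" are not handled by this sketch and would need the `\d+` part extended.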