Find consecutive words in a list that match a string

Question

In our application, we read text from a PDF and process it.

I have run into the following situation.

Suppose a PDF page contains these lines:

It's raining in the evening
John is sitting on a desk    *
and working on his laptop    *
Andy enters the building

Note: * can appear at the start of a line, at the end of a line, or both.

After reading the lines from the PDF, I remove the * characters and convert the lines into sentences. So I get the following sentences:

It's raining in the evening
John is sitting on a desk and working on his laptop
Andy enters the building

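The PDF is also processed by another tool that returns the individual words in the PDF. From it, I get the following list of words for these lines: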

It's 
raining 
in 
the 
evening
John 
is 
sitting 
on 
a 
desk 
*
and 
working 
on 
his 
laptop
*
Andy 
enters 
the 
building

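I need to find the consecutive words in the list that match a specific sentence.

I wrote logic that works fine when the PDF contains no * characters, but it fails when * is present.

Any idea how to do the matching in this case?

Thanks in advance!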

Answer 1

Can you share your logic? You may be able to implement it so that it ignores the *s.
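
Since the original logic was not posted, here is only a minimal sketch of what "ignoring the *s" could look like. It assumes each * arrives as its own token, as in the word list in the question, and that word comparison is exact after trimming whitespace. The function name `find_sentence` and its `ignored` parameter are made up for illustration. It returns the start and end positions in the original word list, so the * tokens do not shift the indices.

```python
def find_sentence(words, sentence, ignored=("*",)):
    """Return (start, end) indices into `words` (inclusive) of the first run
    of consecutive words equal to `sentence`, skipping ignored tokens.
    Returns None if there is no match."""
    target = sentence.split()
    if not target:
        return None
    # Keep (original index, trimmed word) pairs for non-ignored tokens only.
    kept = [(i, w.strip()) for i, w in enumerate(words) if w.strip() not in ignored]
    n = len(target)
    # Slide a window of len(target) over the filtered words.
    for k in range(len(kept) - n + 1):
        if [w for _, w in kept[k:k + n]] == target:
            return kept[k][0], kept[k + n - 1][0]
    return None


# Word list from the question, including trailing spaces and the * tokens.
words = ["It's ", "raining ", "in ", "the ", "evening",
         "John ", "is ", "sitting ", "on ", "a ", "desk ", "*",
         "and ", "working ", "on ", "his ", "laptop", "*",
         "Andy ", "enters ", "the ", "building"]

print(find_sentence(words, "John is sitting on a desk and working on his laptop"))
# (5, 16)
```

If the * can instead be attached to a word (for example `desk*`), the trimming step would also need to strip it from the token, e.g. with `w.strip(" *")`.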