Python regex to find the word "document" in a variable [closed]

Question

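If the word "document" can be produced by deleting characters from a given string, the letters that spell "document" are removed from the string. If letters can then be deleted from the resulting string to leave "document", the letters spelling "document" in that string are removed as well. This continues until it is no longer possible to delete letters so as to leave "document", at which point the last string is returned.

For example, if the string is: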

documdocumententer
     ^^^^^^^^

"document" can be formed by deleting "docum" from the beginning and "enter" from the end, so the "document" in the middle is removed, leaving

documenter
^^^^^^^^

The process is then repeated, leaving

er

Since "document" cannot be formed from "er", "er" is returned.

Similarly, if the string is:

adbocucdmefgnhtj
 ^ ^^^  ^^  ^ ^

the letters spelling "document" are removed, leaving:

abcdfghj

This string is returned, because "document" cannot be formed from it.
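The stopping condition described above amounts to a subsequence test: the process continues only while the letters of "document" still appear in order somewhere in the string. As a minimal sketch (the helper name `can_form` is hypothetical and not part of the original question), the check could look like this:

```python
def can_form(word, s):
    """Return True if `word` can be produced by deleting characters from `s`."""
    remaining = iter(s)
    # `ch in remaining` advances the iterator until it finds `ch`,
    # so the letters of `word` must occur in `s` in the same order.
    return all(ch in remaining for ch in word)


print(can_form("document", "documenter"))        # True
print(can_form("document", "er"))                # False
print(can_form("document", "adbocucdmefgnhtj"))  # True
print(can_form("document", "abcdfghj"))          # False
```

This only answers whether another round of removal is possible; it does not perform the removal itself.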

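Examples

doconeument becomes one
documdocumentent becomes the empty string
documentone becomes one
pydocdbument becomes pydb
documentdocument becomes the empty string

How can I obtain the string of interest from a given string (only for the specific word "document")?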

Answer 1

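I have a solution that uses a regular expression and recursion:

import re

candidates = ["doconeument", "documdocumentent", "documentone",
              "pydocdbument", "documentdocument", "hansi"]
word = "document"

def strip_word(word, candidate):
    # Build ^(.*)d(.*)o(.*)c ... t(.*)$ : the letters of `word` in order,
    # with a capture group for everything before, between, and after them.
    regex = re.compile("^(.*)" + "(.*)".join(word) + "(.*)$")
    match = regex.match(candidate)
    if not match:
        # `word` can no longer be formed, so this is the final string.
        return candidate
    # Keep only the captured text (this drops the matched letters) and repeat.
    return strip_word(word, "".join(match.groups()))

for cand in candidates:
    print(f"'{cand}' -> '{strip_word(word, cand)}'")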

Edit: corrected the code (the first two lines of the function had been left outside it).
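To make the pattern in this answer easier to inspect, here is a sketch that prints the regular expression and the groups it captures, and replaces the recursion with a loop. The names `build_pattern` and `strip_word_loop` are hypothetical, and the use of `re.escape`, `re.DOTALL`, and the empty-word guard are additions that the original answer does not include:

```python
import re


def build_pattern(word):
    # One capture group before, between, and after the letters of `word`.
    # re.escape (an addition) keeps regex metacharacters in `word` literal;
    # re.DOTALL (an addition) lets the groups span newlines.
    letters = "(.*)".join(re.escape(ch) for ch in word)
    return re.compile("^(.*)" + letters + "(.*)$", re.DOTALL)


def strip_word_loop(word, candidate):
    if not word:
        # An empty word would match forever, so return the input unchanged.
        return candidate
    regex = build_pattern(word)
    while True:
        match = regex.match(candidate)
        if not match:
            return candidate
        # Joining the groups keeps everything except the matched letters.
        candidate = "".join(match.groups())


pattern = build_pattern("document")
print(pattern.pattern)
# ^(.*)d(.*)o(.*)c(.*)u(.*)m(.*)e(.*)n(.*)t(.*)$
print(pattern.match("pydocdbument").groups())
# ('py', '', '', 'db', '', '', '', '', '')
print(strip_word_loop("document", "documdocumententer"))
# er
```

A loop avoids Python's recursion limit on inputs that contain many copies of the word. The nested `(.*)` groups can still backtrack heavily on long strings, so this is best suited to short inputs like the ones in the question.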

Answer 2

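I know you explicitly said you need a regular expression, but if you don't mind doing without one, you can try the following function:

def clear_word(s, word="document"):
    i = 0      # index of the next letter of `word` still to be matched
    res = ''   # characters of `s` that are kept
    for char in s:
        # Drop `char` if it is the next unmatched letter of `word`.
        if i < len(word) and char == word[i]:
            i += 1
            continue
        res += char
    if i == len(word):
        # A full copy of `word` was removed; try again on what is left.
        return clear_word(res, word)
    else:
        # `word` could not be completed, so `s` is the final answer.
        return s

You would use it like this: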

>>> clear_word("pydocdbument")
'pydb'
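The same left-to-right scan can also be written without recursion and checked against the examples from the question. This is only a sketch under stated assumptions: `remove_once` and `clear_word_iter` are hypothetical names, and the `examples` table simply restates the inputs and outputs listed in the question.

```python
def remove_once(s, word):
    """Remove the leftmost letters spelling `word`; return None if it cannot be formed."""
    kept = []
    i = 0  # index of the next letter of `word` to match
    for ch in s:
        if i < len(word) and ch == word[i]:
            i += 1           # this character is part of `word`: drop it
        else:
            kept.append(ch)  # everything else is kept
    return "".join(kept) if i == len(word) else None


def clear_word_iter(s, word="document"):
    if not word:
        # An empty word is trivially "found" every time; avoid looping forever.
        return s
    while True:
        reduced = remove_once(s, word)
        if reduced is None:
            return s
        s = reduced


# Inputs and expected outputs taken from the question.
examples = {
    "doconeument": "one",
    "documdocumentent": "",
    "documentone": "one",
    "pydocdbument": "pydb",
    "documentdocument": "",
    "documdocumententer": "er",
    "adbocucdmefgnhtj": "abcdfghj",
}
for text, expected in examples.items():
    result = clear_word_iter(text)
    print(f"{text!r} -> {result!r} (expected {expected!r})")
```

Collecting the kept characters in a list and joining once avoids repeated string concatenation, which matters only for long inputs.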
