How do I replace random elements of a list with a unique symbol?

Question

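I am new to Python programming. I have two lists: the first contains stop words and the other contains a text document. I want to replace the stop words in the text document with "/". Can anyone help?

I used the replace function, but it raises an error: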

text = "This is an example showing off word filtration"
stop = set(stopwords.words("english"))
text = nltk.word_tokenize(document)

for word in stop:
    text = text.replace(stop, "/")
print(text)

It should output "/ / / example showing / word filtration".

Answer 1

How about a list comprehension:

>>> from nltk.corpus import stopwords
>>> from nltk.tokenize import word_tokenize  
>>> stop_words = set(stopwords.words('english'))
>>> text = "This is an example showing off word filtration"
>>> text_tokens = word_tokenize(text) 
>>> replaced_text_words = ["/" if word.lower() in stop_words else word for word in text_tokens]
>>> replaced_text_words
['/', '/', '/', 'example', 'showing', '/', 'word', 'filtration']
>>> replaced_sentence = " ".join(replaced_text_words)
>>> replaced_sentence
'/ / / example showing / word filtration'
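
To try the same idea without downloading the NLTK corpora, here is a minimal sketch that wraps the comprehension in a function. The name `mask_stop_words` and the tiny `SAMPLE_STOP_WORDS` set are assumptions made for illustration, and `str.split` stands in for `word_tokenize`, so punctuation is not separated from words.

```python
# Hypothetical stand-in for set(stopwords.words('english')); illustration only.
SAMPLE_STOP_WORDS = {"this", "is", "an", "off"}


def mask_stop_words(text, stop_words, symbol="/"):
    """Return text with every whitespace-separated stop word replaced by symbol."""
    tokens = text.split()  # simple whitespace tokenizer, not word_tokenize
    masked = [symbol if token.lower() in stop_words else token for token in tokens]
    return " ".join(masked)


result = mask_stop_words("This is an example showing off word filtration", SAMPLE_STOP_WORDS)
print(result)
```

Passing the real NLTK stop-word set as `stop_words` should behave the same way, since only membership tests are used.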

Answer 2

How about using a regular expression pattern?

Your code could look like this:

import re

from nltk.corpus import stopwords

text = "This is an example showing off word filtration"
text = text.lower()

# Match any stop word as a whole word, together with the whitespace after it
pattern = re.compile(r'\b(' + r'|'.join(stopwords.words('english')) + r')\b\s*')
text = pattern.sub('/ ', text)

Based on this post.
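
A variation on the regex idea, offered as a sketch rather than a definitive version: `re.escape` guards against stop words that contain regex metacharacters, and `re.IGNORECASE` avoids lowercasing the whole text. The short `SAMPLE_STOP_WORDS` list is a hypothetical stand-in for `stopwords.words('english')`.

```python
import re

# Hypothetical stand-in for stopwords.words('english'); illustration only.
SAMPLE_STOP_WORDS = ["this", "is", "an", "off"]

# Escape each word, then match any of them only at word boundaries.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in SAMPLE_STOP_WORDS) + r")\b",
    flags=re.IGNORECASE,
)

text = "This is an example showing off word filtration"
masked = pattern.sub("/", text)
print(masked)
```

Because only the matched word is replaced, the original spacing and the case of the remaining words are left untouched.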

Answer 3

You should use word instead of stop in your replace call.

for word in stop:
    text = text.replace(word, "/")
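
One caveat worth checking: `str.replace` works on substrings rather than whole words, and it exists only on strings, so it would not work on the token list returned by `word_tokenize`. A minimal sketch of the substring effect, using a hypothetical one-word stop list:

```python
text = "This is an example"
stop = {"is"}  # hypothetical one-word stop list, illustration only

# str.replace matches substrings, so the "is" inside "This" is replaced too.
naive = text
for word in stop:
    naive = naive.replace(word, "/")
print(naive)  # Th/ / an example

# Comparing whole tokens avoids the partial match.
by_token = " ".join("/" if token.lower() in stop else token for token in text.split())
print(by_token)  # This / an example
```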

Answer 4

You could try this:

' '.join([item if item.lower() not in stop else "/" for item in text])
