Removing NLTK stopwords


Votes: 0 Answers: 1

I am trying to remove stopwords from my dataset.

stopwordsw = nltk.corpus.stopwords.words('german')

def remove_stopwords(txt_clean):
      txt_clean =  [Word for Word in txt_clean if Word not in stopwords]
      return txt_clean

data['Tweet_sw'] = data['Tweet_clean'].apply(lambda x: remove_stopwords(x))
data.head()

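I have two problems.

First, the output comes back character by character (separated by commas), even though I am checking against a stopword list made up of whole words.

I can work around this with join, but I don't understand why the text is being split into characters.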
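One likely explanation for the character-by-character output, assuming each cell of `Tweet_clean` holds a plain string rather than a list of tokens: iterating over a Python `str` yields single characters, so the list comprehension compares each character against the word list. The short stopword list and the sample tweet below are made-up stand-ins for illustration, not data from the post.

```python
# Stand-in stopword list (assumption: the real one comes from NLTK).
stop_words = ["und", "der", "ist"]

# Hypothetical sample text, assumed to be stored as a single string.
tweet = "der Hund ist müde"

# Iterating over a str yields one character at a time, so no element
# can ever equal a whole word from the stopword list.
chars = [w for w in tweet if w not in stop_words]

# Splitting on whitespace first yields words that can match the list.
words = [w for w in tweet.split() if w not in stop_words]

print(chars[:3])
print(words)
```

If this assumption holds, it would also account for the second symptom: single characters never equal multi-letter stopwords, so nothing gets filtered out.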

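The second, real problem is that the stopword removal does not work. Words that are clearly in the list are not removed from the sentences.

Where is my mistake?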
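A minimal sketch of one way to approach this, under a few assumptions that the post does not confirm: the column holds strings, whitespace splitting is an acceptable tokenizer, and matching should ignore case (the NLTK German list is lowercase, while tweets often capitalize words). Note also that the snippet above assigns the list to `stopwordsw` but filters against `stopwords`; if that is not just a typo in the post, the filter is checking a different object than intended. The small set below is a hypothetical stand-in for `set(nltk.corpus.stopwords.words('german'))`.

```python
def remove_stopwords(text, stop_words):
    """Return text with every token found in stop_words removed."""
    tokens = text.split()  # whitespace tokenization, a simplification
    # Compare in lowercase so capitalized tokens still match the list.
    kept = [t for t in tokens if t.lower() not in stop_words]
    return " ".join(kept)


# Stand-in for set(nltk.corpus.stopwords.words('german')).
german_stop_words = {"der", "die", "das", "und", "ist"}

cleaned = remove_stopwords("Der Hund und die Katze", german_stop_words)
print(cleaned)
```

With the DataFrame from the question, the same function could be passed to `data['Tweet_clean'].apply(lambda x: remove_stopwords(x, german_stop_words))`, again assuming the column contains strings. Using a set rather than a list keeps each membership test cheap.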


nlp

nltk

data-cleaning

stop-words

1 Answer
