Returning None from a function: TypeError: object of type 'NoneType' has no len()

Question

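I am trying to print my topics, and the texts belonging to each topic, from an LDA model. But the "None" that appears after the topics are printed breaks my script. I can print my topics, but not the texts.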

import pandas
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

n_top_words = 5
n_components = 5

def print_top_words(model, feature_names, n_top_words):
    for topic_idx, topic in enumerate(model.components_):
        message = "Topic #%d: " % topic_idx
        message += " ".join([feature_names[i] for i in topic.argsort()[:-n_top_words - 1:-1]])

        return message

text = pandas.read_csv('text.csv', encoding = 'utf-8')
text_list = text.values.tolist()

tf_vectorizer = CountVectorizer()
tf = tf_vectorizer.fit_transform(text_list)

lda = LatentDirichletAllocation(n_components=n_components, learning_method='batch', max_iter=25, random_state=0)

doc_distr = lda.fit_transform(tf)

tf_feature_names = tf_vectorizer.get_feature_names()
print (print_top_words(lda, tf_feature_names, n_top_words))

doc_distr = lda.fit_transform(tf)
topics = print_top_words(lda, tf_feature_names, n_top_words)
for i in range(len(topics)):
    print ("Topic {}:".format(i))
    docs = np.argsort(doc_distr[:, i])[::-1]
    for j in docs[:10]:
       print (" ".join(text_list[j].split(",")[:2]))

My output:

Topic 0: no order mail received back 

Topic 1: cancel order wishes possible wish 

Topic 2: keep current informed delivery order 

Topic 3: faulty wooden box present side 

Topic 4: delivered received be produced urgent 

Topic 5: good waiting day response share

Followed by this error:

  File "lda.py", line 41, in <module>

    for i in range(len(topics)):

TypeError: object of type 'NoneType' has no len()

Answer 1

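Your print_top_words() function has (at least) four problems.

The first one, which causes your current problem, is that if model.components_ is empty, the body of the for loop never executes, and your function then (implicitly) returns None.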
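The implicit None can be reproduced without scikit-learn. The sketch below is a minimal illustration, assuming a function with the same shape as the one in the question; first_message and its argument are made-up names.

```python
def first_message(components):
    # Same shape as the question's function: the only `return` sits inside the loop.
    for idx, row in enumerate(components):
        return "Topic #%d" % idx
    # No return statement is reached when `components` is empty,
    # so Python returns None implicitly.

result = first_message([])
print(result)  # None

try:
    len(result)
except TypeError as exc:
    print(exc)  # object of type 'NoneType' has no len()
```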

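The second is more subtle: if model.components_ is not empty, the function builds only the first message, returns it, and exits. That is the definition of the return statement: return a value (or None if no value is specified) and exit the function.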
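A small side-by-side comparison may make the effect of the early return clearer. Both function names are hypothetical and the input is toy data.

```python
def returns_in_loop(items):
    for item in items:
        return item.upper()  # leaves the function on the very first item

def collects_all(items):
    results = []
    for item in items:
        results.append(item.upper())
    return results  # reached once, after the loop has finished

print(returns_in_loop(["a", "b", "c"]))  # A
print(collects_all(["a", "b", "c"]))     # ['A', 'B', 'C']
```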

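The third problem is that (when model.components_ is not empty) the function returns a string where the calling code clearly expects a list. This is a subtle bug because strings have a length, so the for loop over range(len(topics)) will appear to work, but len(topics) is certainly not the value you expect.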
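To see why a string "works" with len() but gives the wrong count, compare a single message with a list that holds it. The variable names are illustrative, and the message text is taken from the output shown in the question.

```python
message = "Topic #0: no order mail received back"  # a single string
topics_as_list = [message]                          # what the caller seems to expect

# len() counts characters for a string but elements for a list.
print(len(message), len(topics_as_list))

# range(len(message)) therefore loops once per character, not once per topic.
loop_indices = list(range(len(message)))
```

In the question's loop each index is also used as a column of doc_distr, so a character count larger than the number of topics would be expected to end in an IndexError.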

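Finally, the function is very poorly named, since it doesn't "print" anything. This may seem trivial compared with the first three problems, and it won't prevent the code from working (assuming the first three are fixed), but reasoning about code is hard enough as it is, so proper naming matters: it greatly reduces cognitive load and makes maintenance and debugging easier.

Long story short: think about what you really want this function to do and fix it accordingly. I won't post a "corrected" version here because I'm not sure what you are trying to do, but the notes above should help.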
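As an illustration only, and assuming the intent is to get one list of top words per topic, the first three points could be addressed along these lines. The name top_words_per_topic, the fake_model stand-in, and the numbers are made up for the sketch. A fitted LatentDirichletAllocation would take the stand-in's place.

```python
from types import SimpleNamespace

def top_words_per_topic(model, feature_names, n_top_words):
    """Return one list of top words per topic (hypothetical helper)."""
    topics = []
    for weights in model.components_:
        # Indices of the weights, largest first.
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        topics.append([feature_names[i] for i in ranked[:n_top_words]])
    return topics  # always a list, possibly empty, never None

# Stand-in for a fitted model: two topics over a four-word vocabulary (made-up numbers).
fake_model = SimpleNamespace(components_=[[0.1, 2.0, 0.5, 1.0], [3.0, 0.2, 0.1, 0.4]])
vocab = ["order", "cancel", "mail", "wish"]

for idx, words in enumerate(top_words_per_topic(fake_model, vocab, 2)):
    print("Topic #%d: %s" % (idx, " ".join(words)))
```

Because the helper returns a list with one entry per topic, len() of the result equals the number of topics, which is what the calling loop in the question relies on.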

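Note: you also call doc_distr = lda.fit_transform(tf) and print_top_words(lda, tf_feature_names, n_top_words) twice with exactly the same arguments. That is either useless, a pure waste of processor cycles (in the best case), or, if the second call gives different results, the smell of another bug.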

Answer 2

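This is a bit hard to answer without knowing the inner workings of LatentDirichletAllocation. However, it has something to do with its components_, since iterating over it repeatedly gives different results.

You can most likely avoid this error by changing the following:

print (print_top_words(lda, tf_feature_names, n_top_words))

doc_distr = lda.fit_transform(tf)
topics = print_top_words(lda, tf_feature_names, n_top_words)

to:

temp = print_top_words(lda, tf_feature_names, n_top_words)
print (temp)
doc_distr = lda.fit_transform(tf)
topics = temp

The second time the function is called, model.components_ yields nothing, so the loop is skipped and the function returns None.

However, I'm not sure that is the actual intent of the code. It looks like you may have wanted print_top_words to be a generator? You return inside the for loop, which means it never reaches a second iteration. That is probably not what the loop was meant to do.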
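If a generator really was the intent, a version using yield might look like the sketch below. It is a guess at that intent, not the asker's code: iter_top_words, the fake_model stand-in, and the weights are all hypothetical.

```python
from types import SimpleNamespace

def iter_top_words(model, feature_names, n_top_words):
    """Yield one message per topic (hypothetical generator variant)."""
    for topic_idx, weights in enumerate(model.components_):
        # Indices of the weights, largest first.
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        words = [feature_names[i] for i in ranked[:n_top_words]]
        yield "Topic #%d: %s" % (topic_idx, " ".join(words))

# Stand-in for a fitted model (made-up numbers).
fake_model = SimpleNamespace(components_=[[0.1, 2.0, 0.5], [3.0, 0.2, 0.1]])
vocab = ["order", "cancel", "mail"]

# A generator has no len(), so materialise it before calling len() on it.
topics = list(iter_top_words(fake_model, vocab, 2))
for line in topics:
    print(line)
```

Unlike return, yield hands back one message and resumes the loop on the next request, so every topic is produced.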

Answer 3

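You did not provide the complete code, but the most likely cause is that the variable topics is None. The only way that can happen is if model.components_ in your print_top_words function is an empty collection, so the loop never runs and the function (implicitly) returns None. Check the value of the collection. Better yet, choose the value to return in that case.

Another, unrelated point: you initialize your message variable on every iteration and also return it on every iteration. Check that this is what you mean.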
