How do I fix a runtime error when running gensim's LDA model?


I am getting a runtime error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  0%|          | 0/29 [00:48<?, ?it/s]

This happens when I try to run this code:

def topic_model_coherence_generator(corpus, texts, dictionary, start_topic_count=2, end_topic_count=10, step=1, cpus=1):
    # Note: cpus is accepted but never used in the body.
    models = []
    coherence_scores = []
    for topic_nums in tqdm(range(start_topic_count, end_topic_count+1, step)):
        # Train on the corpus passed in, not on the global bow_corpus.
        lda_model = gensim.models.LdaModel(corpus=corpus, id2word=dictionary, chunksize=1740, alpha='auto', eta='auto',
                                   random_state=42, iterations=500, num_topics=topic_nums, passes=20, eval_every=None)

        # Score with the texts passed in, not the global norm_corpus_bigrams.
        cv_coherence_model_lda = gensim.models.CoherenceModel(model=lda_model, corpus=corpus,
                                                      texts=texts, dictionary=dictionary,
                                                      coherence='c_v')

        coherence_score = cv_coherence_model_lda.get_coherence()
        coherence_scores.append(coherence_score)
        models.append(lda_model)
    return models, coherence_scores

lda_models, coherence_scores = topic_model_coherence_generator(corpus=bow_corpus,
                                                               texts=norm_corpus_bigrams,
                                                               dictionary= dictionary,
                                                               start_topic_count=2,
                                                               end_topic_count=30,
                                                               step=1, cpus=16)

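What I want is to find the optimal number of topics for my corpus, then extract the topics and interpret the topic model results. I am a biologist, so I don't know how to fix this. Thanks for your help.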

model runtime-error runtime gensim lda
1 Answer

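It is good practice, and may be necessary on Windows, to put your code inside a "main" block (if __name__ == '__main__':) before using other code that relies on Python multiprocessing. See more details in this answer: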

https://stackoverflow.com/a/60459949/130288

(…and possibly the other answer it references.)
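
As a rough illustration of that idiom, here is a minimal sketch using only the standard library. The names score_topic_count and run_sweep are hypothetical, and the placeholder arithmetic stands in for the gensim training and coherence calls from the question. It is not the asker's code, only the shape such a script could take. In the question the worker processes are presumably started inside gensim rather than by an explicit Pool, but the guard plays the same role either way.

```python
import multiprocessing as mp


def score_topic_count(num_topics):
    """Hypothetical stand-in for training one LDA model and scoring it."""
    # Placeholder arithmetic only; a real script would call gensim here.
    return num_topics, 1.0 / num_topics


def run_sweep(start=2, end=5, step=1, processes=2):
    """Score each candidate topic count in a pool of worker processes."""
    topic_counts = range(start, end + 1, step)
    with mp.Pool(processes=processes) as pool:
        return pool.map(score_topic_count, topic_counts)


if __name__ == "__main__":
    # Everything that starts worker processes sits behind this guard, so a
    # child process that re-imports this file does not rerun the sweep.
    mp.freeze_support()  # only matters when frozen into a Windows executable
    for num_topics, score in run_sweep():
        print(num_topics, round(score, 3))
```

Applied to the question, the call to topic_model_coherence_generator(...) and whatever builds bow_corpus, norm_corpus_bigrams and dictionary would move under the same guard, leaving only imports and function definitions at the top level of the file.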
