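I'm working on a uni assignment in VS Code, running an LDA analysis on the "Million News Headlines" dataset.
I had run the code many times, but after increasing the number of topics in the LDA model and re-running it, my pyLDAvis code: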
# Create a visualisation of the topics and their word lists and concordance
# Prepare the visualization
pyLDAvis.enable_notebook() # Only if you are in a Jupyter environment
prepared_vis = gensimvis.prepare(lda_model, corpus, id2word)
# Display the visualization
pyLDAvis.display(prepared_vis)
gives me this error:
--------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[36], line 8
5 prepared_vis = gensimvis.prepare(lda_model, corpus, id2word)
7 # Display the visualization
----> 8 pyLDAvis.display(prepared_vis)
File ~/.pyenv/versions/3.12.2/lib/python3.12/site-packages/pyLDAvis/_display.py:222, in display(data, local, **kwargs)
218 warnings.warn(
219 "display: specified urls are ignored when local=True")
220 kwargs['d3_url'], kwargs['ldavis_url'], kwargs['ldavis_css_url'] = write_ipynb_local_js()
--> 222 return HTML(prepared_data_to_html(data, **kwargs))
File ~/.pyenv/versions/3.12.2/lib/python3.12/site-packages/pyLDAvis/_display.py:177, in prepared_data_to_html(data, d3_url, ldavis_url, ldavis_css_url, template_type, visid, use_http)
170 elif re.search(r'\s', visid):
171 raise ValueError("visid must not contain spaces")
173 return template.render(visid=json.dumps(visid),
174 visid_raw=visid,
175 d3_url=d3_url,
176 ldavis_url=ldavis_url,
--> 177 vis_json=data.to_json(),
178 ldavis_css_url=ldavis_css_url)
File ~/.pyenv/versions/3.12.2/lib/python3.12/site-packages/pyLDAvis/_prepare.py:464, in PreparedData.to_json(self)
463 def to_json(self):
--> 464 return json.dumps(self.to_dict(), cls=NumPyEncoder)
File ~/.pyenv/versions/3.12.2/lib/python3.12/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
232 if cls is None:
233 cls = JSONEncoder
234 return cls(
235 skipkeys=skipkeys, ensure_ascii=ensure_ascii,
236 check_circular=check_circular, allow_nan=allow_nan, indent=indent,
237 separators=separators, default=default, sort_keys=sort_keys,
--> 238 **kw).encode(obj)
File ~/.pyenv/versions/3.12.2/lib/python3.12/json/encoder.py:200, in JSONEncoder.encode(self, o)
196 return encode_basestring(o)
197 # This doesn't pass the iterator directly to ''.join() because the
198 # exceptions aren't as detailed. The list call should be roughly
199 # equivalent to the PySequence_Fast that ''.join() would do.
--> 200 chunks = self.iterencode(o, _one_shot=True)
201 if not isinstance(chunks, (list, tuple)):
202 chunks = list(chunks)
File ~/.pyenv/versions/3.12.2/lib/python3.12/json/encoder.py:258, in JSONEncoder.iterencode(self, o, _one_shot)
253 else:
254 _iterencode = _make_iterencode(
255 markers, self.default, _encoder, self.indent, floatstr,
256 self.key_separator, self.item_separator, self.sort_keys,
257 self.skipkeys, _one_shot)
--> 258 return _iterencode(o, 0)
File ~/.pyenv/versions/3.12.2/lib/python3.12/site-packages/pyLDAvis/utils.py:150, in NumPyEncoder.default(self, obj)
148 if isinstance(obj, np.float64) or isinstance(obj, np.float32):
149 return float(obj)
--> 150 return json.JSONEncoder.default(self, obj)
File ~/.pyenv/versions/3.12.2/lib/python3.12/json/encoder.py:180, in JSONEncoder.default(self, o)
161 def default(self, o):
162 """Implement this method in a subclass such that it returns
163 a serializable object for ``o``, or calls the base implementation
164 (to raise a ``TypeError``).
(...)
178
179 """
--> 180 raise TypeError(f'Object of type {o.__class__.__name__} '
181 f'is not JSON serializable')
**TypeError: Object of type complex128 is not JSON serializable**
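I have no idea why. I tried restarting the kernel and updating pyLDAvis, but neither helped.
Resetting the number of topics to the original 10 instead of 25 made no difference.
Could changing nothing but the number of LDA topics cause this?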
I ran into a similar problem when working with LDA some time ago. Unfortunately I can't reproduce it now, but:
**TypeError: Object of type complex128 is not JSON serializable**
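The serialization failure itself can be reproduced without pyLDAvis or gensim. A minimal sketch, assuming only NumPy and the standard `json` module (the variable names are illustrative and not taken from the question):

```python
import json
import numpy as np

# A single complex NumPy scalar is enough to trigger the failure.
value = np.complex128(1 + 2j)

try:
    json.dumps({"x": value})
    error_message = None
except TypeError as exc:
    # The standard encoder has no representation for complex numbers.
    error_message = str(exc)

print(error_message)
```

So the traceback only says that at least one complex value ended up in the prepared data. It does not say which step produced it.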
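It seems that, for some reason, the LDA step produced some complex numbers, so there are two options:
1. Change how the result is represented by passing a different `mds` method:
prepared_vis = gensimvis.prepare(lda_model, corpus, id2word, mds='mmds')
2. Change how the results are serialized in pyLDAvis/utils.py (CTRL+click on pyLDAvis to jump into the package) and modify this class as follows: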
class NumPyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.int64) or isinstance(obj, np.int32):
            return int(obj)
        if isinstance(obj, np.float64) or isinstance(obj, np.float32):
            return float(obj)
        # Added: replace complex values with their magnitude so they can be serialized
        if np.iscomplexobj(obj):
            return abs(obj)
        return json.JSONEncoder.default(self, obj)
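To try the idea without editing files under site-packages, the same logic can be exercised in a standalone encoder. This is only a sketch: `ComplexSafeEncoder` is a hypothetical name, not part of pyLDAvis, and it checks `np.complexfloating` and uses `float(abs(...))` on scalars rather than the exact checks above. Taking the magnitude discards the sign and the imaginary part, so this hides the symptom rather than explaining where the complex values came from.

```python
import json
import numpy as np


class ComplexSafeEncoder(json.JSONEncoder):
    """Hypothetical stand-in for a patched NumPyEncoder (not part of pyLDAvis)."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.complexfloating):
            # Assumption: replacing a complex scalar by its magnitude is acceptable.
            return float(abs(obj))
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


payload = {"x": np.complex128(3 + 4j), "n": np.int64(7), "f": np.float32(0.5)}
encoded = json.dumps(payload, cls=ComplexSafeEncoder)
print(encoded)
```

If the round trip works here, the same three checks are what the patched class needs for scalar values.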