ValueError when downloading a gensim dataset

Question (votes: 0, answers: 4)

I want to download the gensim glove-wiki-gigaword-100 dataset. This is my code:

import gensim.downloader as api
model = api.load("glove-wiki-gigaword-100")

But I get this error:

ValueError: unable to read local cache '/Users/xxx/gensim-data/information.json' during fallback, connect to the Internet and retry

I checked the gensim version in my terminal and got the following, so I believe it is installed:

pip3 show gensim
Name: gensim
Version: 3.8.3
Summary: Python framework for fast Vector Space Modelling
Home-page: http://radimrehurek.com/gensim
Author: Radim Rehurek
Author-email: [email protected]
License: LGPLv2.1
Location: /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages
Requires: smart-open, scipy, numpy, six
Required-by: 

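I can't figure this out. I restarted my laptop and reset my router. Despite what the error message says, I don't think this problem is related to my internet connection?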

python gensim
4 Answers
1
vote

I may be wrong, but I think this happens because gensim does not support Python 3.8. I downgraded to 3.6 and the problem was solved.


0
votes

Note that if you want to download this model, you can do so from PyCharm:

import gensim.downloader as api
model = api.load("glove-wiki-gigaword-100")

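However, gensim does not work with Python 3.8, so you can downgrade to another Python version such as 3.4, 3.5, or 3.6.
As far as I could check, the model was downloaded, but gensim still did not work.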
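
One way to check what was actually downloaded is to look at the cache folder directly. The sketch below is only an illustration: the helper name gensim_cache_status is made up here, and it assumes the cache lives in ~/gensim-data, which is the location the error message in the question points to.

```python
from pathlib import Path


def gensim_cache_status(base_dir=None):
    """Summarise what is present in the gensim-data cache folder.

    Hypothetical helper, not part of gensim. It only inspects the file system.
    """
    # Assumption: the default cache folder is ~/gensim-data, as in the error message.
    base = Path(base_dir) if base_dir is not None else Path.home() / "gensim-data"
    info_file = base / "information.json"

    # Each downloaded dataset or model is expected to sit in its own subfolder.
    if base.is_dir():
        dataset_folders = sorted(p.name for p in base.iterdir() if p.is_dir())
    else:
        dataset_folders = []

    return {
        "base_dir": str(base),
        "base_dir_exists": base.is_dir(),
        "information_json_exists": info_file.is_file(),
        "dataset_folders": dataset_folders,
    }


if __name__ == "__main__":
    for key, value in gensim_cache_status().items():
        print(f"{key}: {value}")
```

If information_json_exists comes back False, the ValueError from the question is expected whenever the online lookup fails, because there is no local cache to fall back on.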


0
votes

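I ran into the same problem. After an OS upgrade, it worked on 3.6 but not on 3.8. One issue is that this code, as written, gives no useful debugging information beyond the ValueError. Let's fix that.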

Python 3.8.10 (v3.8.10:3d8993a744, May  3 2021, 08:55:58) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import logging
>>> logging.basicConfig()
>>> import gensim.downloader as api
>>> model = api.load("glove-wiki-gigaword-100")

Adding the logging setup above reveals that this is actually an SSL error.

ERROR:gensim.downloader:caught non-fatal exception while trying to update gensim-data cache from 'https://raw.githubusercontent.com/RaRe-Technologies/gensim-data/master/list.json'; using local cache at '/Users/marbron/gensim-data/information.json' instead
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1007, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 947, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1421, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 199, in _load_info
    info_bytes = urlopen(url).read()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1357, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)>
Traceback (most recent call last):
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 219, in _load_info
    with io.open(cache_path, 'r', encoding=encoding) as fin:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/marbron/gensim-data/information.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 490, in load
    file_name = _get_filename(name)
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 426, in _get_filename
    information = info()
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 268, in info
    information = _load_info()
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 222, in _load_info
    raise ValueError(
ValueError: unable to read local cache '/Users/marbron/gensim-data/information.json' during fallback, connect to the Internet and retry
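
To confirm that the failure comes from certificate verification rather than from gensim, the same URL can be requested with the standard library alone. This is a minimal sketch: probe_https is a hypothetical helper name, the URL is the one shown in the log above, and the "ok" / "ssl" / "other" labels are just an illustrative classification.

```python
import ssl
import urllib.error
import urllib.request

# URL taken from the gensim log message above.
LIST_URL = "https://raw.githubusercontent.com/RaRe-Technologies/gensim-data/master/list.json"


def probe_https(url=LIST_URL, timeout=10):
    """Try to fetch `url` and return (kind, message).

    kind is "ok" on success, "ssl" for TLS/certificate failures,
    and "other" for any other network-level error.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read(1)  # reading one byte is enough to prove the connection works
        return "ok", ""
    except urllib.error.URLError as err:
        # urllib wraps the underlying socket/SSL error in err.reason.
        kind = "ssl" if isinstance(err.reason, ssl.SSLError) else "other"
        return kind, str(err.reason)
    except OSError as err:
        # Errors raised outside urllib's wrapping, e.g. a timeout while reading.
        kind = "ssl" if isinstance(err, ssl.SSLError) else "other"
        return kind, str(err)


if __name__ == "__main__":
    kind, message = probe_https()
    print(kind, message)
```

A result of "ssl" with a CERTIFICATE_VERIFY_FAILED message would match the traceback above and suggests the Python installation cannot find its CA certificates.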

There are several good answers on how to fix the error above, but in short:

  • pip install -U certifi
  • /Applications/Python 3.X/Install Certificates.command

Then run the snippet above again, and the model will be downloaded.
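
If running the Install Certificates.command script is not an option, a per-process alternative is to point Python at certifi's CA bundle before making the request. This is a sketch under assumptions, not something from the original answer: point_ssl_at_certifi is a hypothetical helper, and it relies on the OpenSSL build behind Python honouring the SSL_CERT_FILE environment variable, which standard builds generally do.

```python
import os


def point_ssl_at_certifi():
    """Set SSL_CERT_FILE to certifi's CA bundle for the current process.

    Returns the bundle path, or None if certifi is not installed.
    Hypothetical helper; assumes OpenSSL honours SSL_CERT_FILE.
    """
    try:
        import certifi  # installed via: pip install -U certifi
    except ImportError:
        return None

    bundle = certifi.where()  # absolute path to certifi's cacert.pem
    os.environ["SSL_CERT_FILE"] = bundle
    return bundle


if __name__ == "__main__":
    bundle = point_ssl_at_certifi()
    print("CA bundle:", bundle)
    # After this, retry the download in the same process:
    #   import gensim.downloader as api
    #   model = api.load("glove-wiki-gigaword-100")
```

The setting only affects the running process, so it has to be applied in the same script or session that calls api.load.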


0
votes

Thanks for the answers!

The SSL solution did not work for me. As of April 2024, downgrading to Python 3.11 solved the problem.
