Hugging Face | ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on

Problem description | Votes: 0 | Answers: 5

Not always, but this error occasionally appears when running my code.

At first I suspected a connection problem, but it appears to be a caching issue instead, as discussed in an old GitHub issue.

Clearing the cache did not help:

$ rm ~/.cache/huggingface/transformers/ *

Traceback highlights:

  • NLTK also reports
    Error loading stopwords: <urlopen error [Errno -2] Name or service not known
  • The last 2 lines are
    cached_path
    get_from_cache

Cache (before clearing):

$ cd ~/.cache/huggingface/transformers/
(sdg) me@PF2DCSXD:~/.cache/huggingface/transformers$ ls
16a2f78023c8dc511294f0c97b5e10fde3ef9889ad6d11ffaa2a00714e73926e.cf2d0ecb83b6df91b3dbb53f1d1e4c311578bfd3aa0e04934215a49bf9898df0
16a2f78023c8dc511294f0c97b5e10fde3ef9889ad6d11ffaa2a00714e73926e.cf2d0ecb83b6df91b3dbb53f1d1e4c311578bfd3aa0e04934215a49bf9898df0.json
16a2f78023c8dc511294f0c97b5e10fde3ef9889ad6d11ffaa2a00714e73926e.cf2d0ecb83b6df91b3dbb53f1d1e4c311578bfd3aa0e04934215a49bf9898df0.lock
4029f7287fbd5fa400024f6bbfcfeae9c5f7906ea97afcaaa6348ab7c6a9f351.723d8eaff3b27ece543e768287eefb59290362b8ca3b1c18a759ad391dca295a.h5
4029f7287fbd5fa400024f6bbfcfeae9c5f7906ea97afcaaa6348ab7c6a9f351.723d8eaff3b27ece543e768287eefb59290362b8ca3b1c18a759ad391dca295a.h5.json
4029f7287fbd5fa400024f6bbfcfeae9c5f7906ea97afcaaa6348ab7c6a9f351.723d8eaff3b27ece543e768287eefb59290362b8ca3b1c18a759ad391dca295a.h5.lock
684fe667923972fb57f6b4dcb61a3c92763ad89882f3da5da9866baf14f2d60f.c7ed1f96aac49e745788faa77ba0a26a392643a50bb388b9c04ff469e555241f
684fe667923972fb57f6b4dcb61a3c92763ad89882f3da5da9866baf14f2d60f.c7ed1f96aac49e745788faa77ba0a26a392643a50bb388b9c04ff469e555241f.json
684fe667923972fb57f6b4dcb61a3c92763ad89882f3da5da9866baf14f2d60f.c7ed1f96aac49e745788faa77ba0a26a392643a50bb388b9c04ff469e555241f.lock
c0c761a63004025aeadd530c4c27b860ec4ecbe8a00531233de21d865a402598.5d12962c5ee615a4c803841266e9c3be9a691a924f72d395d3a6c6c81157788b
c0c761a63004025aeadd530c4c27b860ec4ecbe8a00531233de21d865a402598.5d12962c5ee615a4c803841266e9c3be9a691a924f72d395d3a6c6c81157788b.json
c0c761a63004025aeadd530c4c27b860ec4ecbe8a00531233de21d865a402598.5d12962c5ee615a4c803841266e9c3be9a691a924f72d395d3a6c6c81157788b.lock
fc674cd6907b4c9e933cb42d67662436b89fa9540a1f40d7c919d0109289ad01.7d2e0efa5ca20cef4fb199382111e9d3ad96fd77b849e1d4bed13a66e1336f51
fc674cd6907b4c9e933cb42d67662436b89fa9540a1f40d7c919d0109289ad01.7d2e0efa5ca20cef4fb199382111e9d3ad96fd77b849e1d4bed13a66e1336f51.json
fc674cd6907b4c9e933cb42d67662436b89fa9540a1f40d7c919d0109289ad01.7d2e0efa5ca20cef4fb199382111e9d3ad96fd77b849e1d4bed13a66e1336f51.lock

Code:

from transformers import pipeline, set_seed

generator = pipeline('text-generation', model='gpt2')  # Error
set_seed(42)

Traceback:

2022-03-03 10:18:06.803989: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-03-03 10:18:06.804057: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
[nltk_data] Error loading stopwords: <urlopen error [Errno -2] Name or
[nltk_data]     service not known>
2022-03-03 10:18:09.216627: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-03-03 10:18:09.216700: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-03 10:18:09.216751: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (PF2DCSXD): /proc/driver/nvidia/version does not exist
2022-03-03 10:18:09.217158: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-03 10:18:09.235409: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.
Traceback (most recent call last):
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/c/Users/me/Documents/GitHub/project/foo/bar/__main__.py", line 26, in <module>
    nlp_setup()
  File "/mnt/c/Users/me/Documents/GitHub/project/foo/bar/utils/Modeling.py", line 37, in nlp_setup
    generator = pipeline('text-generation', model='gpt2')
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/site-packages/transformers/pipelines/__init__.py", line 590, in pipeline
    tokenizer = AutoTokenizer.from_pretrained(
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 463, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 324, in get_tokenizer_config
    resolved_config_file = get_file_from_repo(
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/site-packages/transformers/file_utils.py", line 2235, in get_file_from_repo
    resolved_file = cached_path(
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/site-packages/transformers/file_utils.py", line 1846, in cached_path
    output_path = get_from_cache(
  File "/home/me/miniconda3/envs/sdg/lib/python3.8/site-packages/transformers/file_utils.py", line 2102, in get_from_cache
    raise ValueError(
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

Failed attempts

  1. Closed the IDE and bash terminal, ran wsl.exe --shutdown in PowerShell, then restarted the IDE and bash terminal. Same error.
  2. Disconnected from / switched to a different VPN.
  3. Cleared the cache:
    $ rm ~/.cache/huggingface/transformers/ *
python-3.x tensorflow huggingface-transformers valueerror gpt-2
5 Answers
2 votes

I saw an answer on GitHub that you can try:

Pass force_download=True to from_pretrained; this will override the cache and re-download the files.

Link: https://github.com/huggingface/transformers/issues/8690 (author: patil-suraj)
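
As a sketch, applying that suggestion to the GPT-2 pipeline from the question might look like this (the TF model class is the one shown in the traceback; adapt it to your setup):

from transformers import AutoTokenizer, TFGPT2LMHeadModel, pipeline

# Re-download the tokenizer and model files instead of trusting a possibly corrupted cache.
tokenizer = AutoTokenizer.from_pretrained('gpt2', force_download=True)
model = TFGPT2LMHeadModel.from_pretrained('gpt2', force_download=True)

generator = pipeline('text-generation', model=model, tokenizer=tokenizer)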


1 vote

Make sure you are not loading a tokenizer with an empty path. That fixed the problem for me.
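
A minimal sketch of that check (model_name is a hypothetical variable standing in for wherever your identifier or path comes from):

from transformers import pipeline

model_name = 'gpt2'  # hypothetical: in practice this might come from a config file or env var

# Per this answer, an empty or None value ends up in the same failing cache lookup
# and surfaces as the "Connection error / cached path" ValueError.
if not model_name:
    raise ValueError('Empty model name/path; refusing to call from_pretrained')

generator = pipeline('text-generation', model=model_name)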


0 votes

Since I work in a conda venv and use Poetry to handle dependencies, I needed to re-install torch, a dependency of Hugging Face 🤗 Transformers.


First, install torch: PyTorch's website lets you pick your exact installation setup/spec. In my case the command was

conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

Then add it to Poetry:

poetry add torch

Both take a long while to process. Runtime is back to normal :)


0 votes

Oddly enough, I ran into this issue with a custom Docker image while trying to deploy to Azure AKS.

The Docker image was built with docker commit from a container that already had all the files in its cache. Since the cache was pre-populated, I did not expect anything to reach out to the Internet during deployment (access is restricted), but it failed with the error message:

ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

These two environment variables need to be set in the container:

TRANSFORMERS_OFFLINE=1
PYTORCH_TRANSFORMERS_CACHE=/home/app_user/.cache/huggingface/transformers

The first tells it not to go out to the Internet to download files, and the second is the path to the cache that contains the downloaded Hugging Face files.
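
If you set them from Python instead of the Dockerfile, export them before importing transformers, since offline mode is typically read when the library is imported. A minimal sketch (the cache path is the example path from this answer; adjust it to your image):

import os

# Never reach out to the network; resolve everything from the pre-populated cache.
os.environ['TRANSFORMERS_OFFLINE'] = '1'
os.environ['PYTORCH_TRANSFORMERS_CACHE'] = '/home/app_user/.cache/huggingface/transformers'

from transformers import pipeline  # import only after the environment variables are set

generator = pipeline('text-generation', model='gpt2')  # served from the local cache only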


0 votes

This command almost killed me: the stray space before the * makes the glob expand in the current working directory, so rm targets everything there instead of the cache folder (the intended command has no space before the *).

$ rm ~/.cache/huggingface/transformers/ *

(torch2.0.0_cu11.8) (base)  ✘ zanzhuheng@asus  ~/Desktop/Working   main ±  rm ~/.cache/huggingface/transformers/ *
zsh: sure you want to delete all 16 files in /home/zanzhuheng/Desktop/Working [yn]? y
rm: cannot remove '/home/zanzhuheng/.cache/huggingface/transformers/': No such file or directory
rm: cannot remove 'birds': Is a directory
rm: cannot remove 'breast': Is a directory
rm: cannot remove 'breast_ckpt': Is a directory
rm: cannot remove 'Child_Mind': Is a directory
rm: cannot remove 'convnext': Is a directory
rm: cannot remove 'ESCA': Is a directory
rm: cannot remove 'home': Is a directory
rm: cannot remove 'leaves': Is a directory
rm: cannot remove 'Multi_Stain_Deep_Learning_Nature_Medicine_2023': Is a directory
rm: cannot remove 'rmb': Is a directory
rm: cannot remove 'rsna_breast_cancer': Is a directory
rm: cannot remove 'to_the_MICCAI': Is a directory
rm: cannot remove 'UBC_Ovarian_Cancer': Is a directory
rm: cannot remove 'wandb': Is a directory
(torch2.0.0_cu11.8) (base)  ✘ zanzhuheng@asus  ~/Desktop/Working   main ±  
(torch2.0.0_cu11.8) (base)  ✘ zanzhuheng@asus  ~/Desktop/Working   main ±  rm ~/.cache/huggingface/transformers/ *
zsh: sure you want to delete all 14 files in /home/zanzhuheng/Desktop/Working [yn]? n
