CUDA out of memory when running a pretrained model

Question (votes: 0, answers: 1)

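I am new to the PyTorch world. I have used search and several other sources to try to get rid of a CUDA out-of-memory error, but without luck. Maybe someone here has a solution.

I have the following code and simply want to run it: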

import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"

prompt = "USER: <image>\nWhat are these?\nASSISTANT:"
image_file = "http://images.cocodataset.org/val2017/000000039769.jpg"

# Load the weights in half precision and move the whole model to GPU 0.
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to(0)

processor = AutoProcessor.from_pretrained(model_id)

# Download the test image and build the model inputs on GPU 0.
raw_image = Image.open(requests.get(image_file, stream=True).raw)
inputs = processor(text=prompt, images=raw_image, return_tensors='pt').to(0, torch.float16)

# Greedy decoding, then print the generated text without special tokens.
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))



When I start the program, I immediately get the standard CUDA out-of-memory error.

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4070 Ti     Off | 00000000:01:00.0 Off |                  N/A |
| 30%   57C    P0              34W / 285W |      0MiB / 12282MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

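Is the graphics card really too weak? I find that hard to believe, because running the script on the CPU takes only about 20 seconds. I have tried everything I could find: changing the batch size, clearing the cache, and restarting. Does anyone know how to run the pretrained model, or can anyone point me in the right direction?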
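As a rough sanity check on the "too weak" question, the memory needed just to hold the weights can be estimated from the parameter count and the bytes per parameter. The sketch below assumes about 13 billion parameters (inferred from the "13b" in the model id, not measured) and takes the 12282 MiB total from the nvidia-smi output above. The helper name `weight_memory_gib` is hypothetical. It ignores activations, the KV cache, and CUDA overhead, so real usage would be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory (GiB) needed to store the model weights alone."""
    return num_params * bytes_per_param / 2**30


# Assumption: roughly 13e9 parameters, inferred from "13b" in the model id.
NUM_PARAMS = 13e9

# Total GPU memory reported by nvidia-smi above, converted from MiB to GiB.
GPU_TOTAL_GIB = 12282 / 1024

# Bytes per parameter for a few common storage formats.
for name, nbytes in [("float32", 4), ("float16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    needed = weight_memory_gib(NUM_PARAMS, nbytes)
    verdict = "fits" if needed < GPU_TOTAL_GIB else "does not fit"
    print(f"{name:8s} weights: {needed:6.1f} GiB -> {verdict} in {GPU_TOTAL_GIB:.1f} GiB")
```

Under these assumptions the float16 weights alone are about twice the size of the card's memory, which would be consistent with the error appearing immediately at the `.to(0)` step rather than during generation.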

python tensorflow pytorch cuda
1 answer
0 votes

Have you tried using a local image instead of one loaded from a URL?

You could also try a smaller image or a smaller number of tokens.
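The suggestion above can be sketched as follows. The helper `fit_within` and the 336-pixel limit are hypothetical values chosen for illustration and do not come from the original post. The commented lines show how it might be combined with the question's code, with a lower `max_new_tokens`. This only illustrates the suggestion and is not confirmed to resolve the error.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    The aspect ratio is preserved and images that are already small enough
    are returned unchanged.
    """
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


# Example with an illustrative 640x480 image size.
print(fit_within(640, 480, 336))

# Hypothetical use with the code from the question (assumes raw_image,
# model, and inputs are defined as in the question):
#   raw_image = Image.open("local_image.jpg")  # local file instead of a URL
#   raw_image = raw_image.resize(fit_within(*raw_image.size, 336))
#   output = model.generate(**inputs, max_new_tokens=50, do_sample=False)
```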
