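I have to dockerize an existing project that is built with setuptools from a setup.py file rather than from a requirements.txt.
The build involves large binary downloads (pytorch, faster-whisper), and after the build an initial run downloads the corresponding models. About 10 GB in total.
To get a complete build, I need to COPY the package files before installing, which means everything is rebuilt every time a source file changes.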
FROM python:3.11-slim
WORKDIR /app
RUN apt update && \
apt install -y --no-install-recommends git ffmpeg curl
COPY setup.py /app
# this is the problem:
# if I move this line behind the next line,
# the build will result in an incomplete package
# but if I keep it here, all the following
# layers will not be cached and the
# downloads will run again
COPY mypackage /app/mypackage
# runs setuptools and installs deps,
# including 2.2GB pytorch
RUN pip install ./ --extra-index-url https://download.pytorch.org/whl/cu118
# downloads ~8GB of models
RUN ["mypackage", "init"]
# I would love to move COPY of the project
# files to this position
CMD ["mypackage", "start"]
Contents of the setup.py file:
from setuptools import setup, find_packages
from distutils.util import convert_path
import platform

system = platform.system()
if system in ["Windows","Linux"]:
    torch = "torch==2.0.0+cu118"
if system == "Darwin":
    torch = "torch==2.0.0"

main_ns = {}
ver_path = convert_path('mypackage/version.py')
with open(ver_path) as ver_file:
    exec(ver_file.read(), main_ns)

setup(
    name='aTrain',
    version=main_ns['__version__'],
    readme="README.md",
    license="LICENSE",
    python_requires=">=3.10",
    install_requires=[
        torch,
        "torchaudio==2.0.1",
        "faster-whisper>=0.8",
        "transformers",
        "ffmpeg-python>=0.2",
        "pandas",
        "pyannote.audio==3.0.0",
        "Flask==2.3.2",
        "pywebview==4.2.2",
        "flaskwebgui",
        "screeninfo==0.8.1",
        "wakepy==0.7.2",
        "show-in-file-manager==1.1.4"
    ],
    packages=find_packages(),
    include_package_data=True,
    entry_points={
        'console_scripts': ['mypackage = mypackage:cli',]
    }
)
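I am quite new to all this and would like to know what options I have to avoid downloading everything again.
You can try installing the install_requires dependencies first, in their own layer. A description of how layers and the build cache work can be found here (https://docs.docker.com/build/cache/).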
# dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN apt update && \
apt install -y --no-install-recommends git ffmpeg curl
# the destination can be relative (./) because WORKDIR is already /app
COPY requirements.txt setup.py ./
# install deps - this is the task which takes a long time
RUN pip install -r requirements.txt
COPY mypackage ./mypackage
# runs setuptools and installs the package itself; dependencies already
# installed by the previous pip step are reused if the versions match
RUN pip install ./ --extra-index-url https://download.pytorch.org/whl/cu118
# downloads ~8GB of models
RUN ["mypackage", "init"]
# note: this step comes after COPY mypackage, so the model download
# still re-runs whenever the source changes
CMD ["mypackage", "start"]
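To see why the order of COPY and RUN matters, here is a toy model of the layer cache in Python. It is only a sketch under simplifying assumptions: each step's cache key is a hash of the previous key, the instruction text, and the contents of any copied files. The names layer_keys and reused_layers and the placeholder file contents are made up for illustration, and Docker's real cache logic is more involved.

```python
import hashlib


def layer_keys(steps, file_contents):
    """Return one cache key per step.

    Each key chains the parent key, the instruction text and, for COPY
    steps, the names and contents of the copied files (toy model only).
    """
    keys = []
    parent = ""
    for instruction, copied_files in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in copied_files:
            h.update(name.encode())
            h.update(file_contents[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(steps, before, after):
    """Count the leading steps whose key is identical in both builds."""
    count = 0
    for old, new in zip(layer_keys(steps, before), layer_keys(steps, after)):
        if old != new:
            break
        count += 1
    return count


# Step order from the question: the source is copied before pip runs.
ORIGINAL = [
    ("COPY setup.py /app", ["setup.py"]),
    ("COPY mypackage /app/mypackage", ["mypackage"]),
    ("RUN pip install ./", []),
    ("RUN mypackage init", []),
]

# Step order from the answer: dependencies are installed before the source is copied.
REORDERED = [
    ("COPY requirements.txt setup.py ./", ["requirements.txt", "setup.py"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY mypackage ./mypackage", ["mypackage"]),
    ("RUN pip install ./", []),
    ("RUN mypackage init", []),
]

# Placeholder contents; only the package source differs between the builds.
before = {"setup.py": "v1", "requirements.txt": "deps", "mypackage": "source v1"}
after = dict(before, mypackage="source v2")

print(reused_layers(ORIGINAL, before, after))   # 1: only COPY setup.py is reused
print(reused_layers(REORDERED, before, after))  # 2: the dependency install is reused too
```

In this model a source change invalidates every step from COPY mypackage onward, so the expensive pip step is only reused when it sits before that COPY.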
# requirements.txt
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.0.0+cu118
torchaudio==2.0.1
faster-whisper>=0.8
transformers
ffmpeg-python>=0.2
pandas
pyannote.audio==3.0.0
Flask==2.3.2
pywebview==4.2.2
flaskwebgui
screeninfo==0.8.1
wakepy==0.7.2
show-in-file-manager==1.1.4
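Since this approach keeps the dependency list in two places, the two can drift apart, and a mismatch would make the second pip step download packages again. Below is a rough sketch of a consistency check. It assumes that install_requires is a plain list literal inside the setup() call; it only compares string literals, so a variable entry such as torch is skipped. The helper names are hypothetical.

```python
import ast


def literal_install_requires(setup_source):
    """Return the string literals in the install_requires list of setup()."""
    tree = ast.parse(setup_source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "setup":
            for kw in node.keywords:
                if kw.arg == "install_requires" and isinstance(kw.value, ast.List):
                    return [
                        e.value
                        for e in kw.value.elts
                        if isinstance(e, ast.Constant) and isinstance(e.value, str)
                    ]
    return []


def requirement_lines(requirements_text):
    """Return requirement specifiers, skipping comments, blanks and options."""
    lines = []
    for raw in requirements_text.splitlines():
        # Simplified comment handling: everything after '#' is dropped.
        line = raw.split("#", 1)[0].strip()
        if line and not line.startswith("-"):
            lines.append(line)
    return lines


def missing_from_requirements(setup_source, requirements_text):
    """List literal install_requires entries absent from requirements.txt."""
    listed = set(requirement_lines(requirements_text))
    return [r for r in literal_install_requires(setup_source) if r not in listed]


# Small made-up inputs to show the behaviour.
SETUP = '''
from setuptools import setup
torch = "torch==2.0.0+cu118"
setup(name="demo", install_requires=[torch, "pandas", "Flask==2.3.2"])
'''
REQS = "--extra-index-url https://download.pytorch.org/whl/cu118\ntorch==2.0.0+cu118\npandas\n"

print(missing_from_requirements(SETUP, REQS))  # ['Flask==2.3.2']
```

A check like this could run before docker build, reading the real setup.py and requirements.txt from disk instead of the inline strings.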