Polyglot dependencies cause problems during multi-stage containerization (Python) - Docker


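I am containerizing a project that uses polyglot, TensorFlow, and other libraries. When I use a multi-stage build, I get errors from polyglot's dependencies.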

A single-stage build works fine. Here is the Dockerfile for the single-stage build:

FROM python:3.9 AS python

# Set the working directory in the container for Python
WORKDIR /app_py

# Copy the Python requirements.txt
COPY requirements.txt .

# Install system-level dependencies needed by polyglot
RUN apt-get update \
    && apt-get install -y libicu-dev libcld2-dev

# Install Python dependencies
RUN pip install --timeout 100000 --no-cache-dir -r requirements.txt
RUN python -m nltk.downloader stopwords

# Copy the Python application to the container
COPY sentence_embedder.py \
  k_means_clustering.py \
  bow_and_tf_idf.py \
  compare_final_results.py \
  universal-sentence-encoder-multilingual_3 \
  stopwords \
  qna_tensors.xlsx \
  Chatbot_Questions_CF_Departments.xlsx \
  ./

# Explicitly copy the Universal Sentence Encoder files
COPY universal-sentence-encoder-multilingual_3 /app_py/universal-sentence-encoder-multilingual_3

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Define command to start Python app
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:5000", "sentence_embedder:app"]  

But when I try a multi-stage build, polyglot's dependencies cause errors.

Here is the Dockerfile for the multi-stage build:

# Step 2: Use Python as the second base image
FROM python:3.9 AS builder

# Set the working directory in the container for Python
WORKDIR /app_py

# Copy the requirements and Python application to the container
COPY requirements.txt \
    sentence_embedder.py \
    k_means_clustering.py \
    bow_and_tf_idf.py \
    compare_final_results.py \
    universal-sentence-encoder-multilingual_3 \
    stopwords \
    qna_tensors.xlsx \
    Chatbot_Questions_CF_Departments.xlsx \
    ./

# Install system-level dependencies needed by polyglot
RUN apt-get update \
    && apt-get install -y libicu-dev libcld2-dev \
    && pip install --timeout 100000 -r requirements.txt \
    && python -m nltk.downloader stopwords \
    && apt-get clean && rm -rf /var/lib/apt/lists/* 
    
# Set the environment variable for shared libraries
ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
     
# Stage 2: Production stage with slim Python image
FROM python:3.9 AS production

# Install necessary system-level tools and dependencies
RUN apt-get update && \
    apt-get install -y wget gnupg2 libicu-dev libcld2-dev &&\
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Set the working directory in the container for Python
WORKDIR /app_py

RUN echo "env var for front 2"

# Copy only necessary files and dependencies from the builder stage
COPY --from=builder /usr/lib/x86_64-linux-gnu/libtk8.6.so /usr/lib/x86_64-linux-gnu/
COPY --from=builder /usr/local/lib/python3.9/site-packages/ /usr/local/lib/python3.9/site-packages/
COPY --from=builder /usr/local/bin/gunicorn /usr/local/bin/gunicorn
COPY --from=builder /app_py /app_py


# Set the environment variable for shared libraries
ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH

# Add gunicorn binary directory to the $PATH
ENV PATH="/usr/local/bin:${PATH}"

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Define command to start Python app
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:5000", "sentence_embedder:app"]

The errors generally come from one of two dependencies:

  File "/app_py/sentence_embedder.py", line 7, in <module>
    from bow_and_tf_idf import BoWandTFIDF
  File "/app_py/bow_and_tf_idf.py", line 4, in <module>
    from polyglot.text import Text
  File "/usr/local/lib/python3.9/site-packages/polyglot/text.py", line 9, in <module>
    from polyglot.detect import Detector, Language
  File "/usr/local/lib/python3.9/site-packages/polyglot/detect/__init__.py", line 1, in <module>
    from .base import Detector, Language
  File "/usr/local/lib/python3.9/site-packages/polyglot/detect/base.py", line 11, in <module>
    from icu import Locale
  File "/usr/local/lib/python3.9/site-packages/icu/__init__.py", line 3, in <module>
    import tkinter as tk
  File "/usr/local/lib/python3.9/tkinter/__init__.py", line 37, in <module>
    import _tkinter # If this fails your Python may not be configured for Tk
ImportError: libtcl8.6.so: cannot open shared object file: No such file or directory

  File "/app_py/sentence_embedder.py", line 7, in <module>
    from bow_and_tf_idf import BoWandTFIDF
  File "/app_py/bow_and_tf_idf.py", line 4, in <module>
    from polyglot.text import Text
  File "/usr/local/lib/python3.9/site-packages/polyglot/text.py", line 9, in <module>
    from polyglot.detect import Detector, Language
  File "/usr/local/lib/python3.9/site-packages/polyglot/detect/__init__.py", line 1, in <module>
    from .base import Detector, Language
  File "/usr/local/lib/python3.9/site-packages/polyglot/detect/base.py", line 11, in <module>
    from icu import Locale
ImportError: cannot import name 'Locale' from 'icu' (/usr/local/lib/python3.9/site-packages/icu/__init__.py)
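
Both tracebacks pass through site-packages/icu/__init__.py, and the first one shows that file importing tkinter. A minimal sketch for checking which installed distributions ship files for the top-level module icu, using importlib.metadata from the standard library (the function name distributions_providing is hypothetical, not part of any library):

```python
from importlib import metadata


def distributions_providing(top_level):
    """Return sorted names of installed distributions that ship files for `top_level`."""
    owners = set()
    for dist in metadata.distributions():
        # dist.files can be None when the distribution has no file record
        for path in dist.files or []:
            parts = path.parts
            if not parts:
                continue
            first = parts[0]
            # Match a package directory (icu/...) or a single-file module (icu.py, icu.*.so)
            is_package = first == top_level and len(parts) > 1
            is_module = len(parts) == 1 and first.split(".")[0] == top_level
            if is_package or is_module:
                name = dist.metadata["Name"]
                if name:
                    owners.add(name)
                break
    return sorted(owners)


if __name__ == "__main__":
    # Run inside the container, e.g. in the builder stage
    print(distributions_providing("icu"))
```

If more than one distribution is listed, they may be overwriting each other's files in site-packages.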

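Please note that the multi-stage Dockerfile is the result of many iterations. All the additional COPY commands in the production stage were added because of missing dependencies. Any help would be greatly appreciated.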

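I am trying to build a lightweight Python application container with a multi-stage build. With the single-stage build, the resulting container was about 3.2 GB. When I attempt the multi-stage build, I usually get errors from polyglot's dependencies (triggered by from polyglot.text import Text).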

python tkinter icu docker-multi-stage-build polyglot
1 Answer

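I have solved this problem and am sharing the fix in case it is useful to someone. In my case, the tkinter error and the "from icu import" error occurred because the icu package had been installed by mistake as a polyglot dependency. Only PyICU is needed, so icu should be removed from requirements.txt. Hopefully this resolves the issue for you as well.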
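
The requirements.txt check described above can be sketched in Python. This is a minimal sketch, not a full requirements parser: it assumes one requirement per line, skips option lines such as -r or -e, and the function names and the sample file contents are hypothetical.

```python
import re


def normalize(name):
    """Normalize a package name so that 'Py_ICU' and 'py-icu' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_requirement_lines(requirements_text, package):
    """Return (line_number, line) pairs whose requirement name equals `package`."""
    target = normalize(package)
    hits = []
    for number, raw in enumerate(requirements_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match and normalize(match.group(0)) == target:
            hits.append((number, raw.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only
    sample = "polyglot\nPyICU\nicu\n"
    for number, line in find_requirement_lines(sample, "icu"):
        print(f"line {number}: remove '{line}' (PyICU already provides the icu module)")
```

Matching on the whole requirement name means PyICU is left alone and only the separate icu entry is reported.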

In addition, for errors about missing shared object files such as libicui18n.so.72, the files should be copied from the builder stage:

COPY --from=builder /usr/lib/x86_64-linux-gnu/libicui18n.so.72 /usr/lib/x86_64-linux-gnu/
COPY --from=builder /usr/lib/x86_64-linux-gnu/libicuuc.so.72 /usr/lib/x86_64-linux-gnu/
COPY --from=builder /usr/lib/x86_64-linux-gnu/libicudata.so.72 /usr/lib/x86_64-linux-gnu/
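
One way to confirm that the production stage can actually load these libraries before gunicorn starts is to try opening them with ctypes. A minimal sketch, assuming the .so.72 file names from the COPY lines above (the function name missing_shared_libraries is hypothetical):

```python
import ctypes


def missing_shared_libraries(names):
    """Return the subset of `names` that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Library names taken from the COPY lines; adjust to your ICU version
    libs = ["libicuuc.so.72", "libicui18n.so.72", "libicudata.so.72"]
    missing = missing_shared_libraries(libs)
    if missing:
        print("Missing shared libraries:", ", ".join(missing))
    else:
        print("All ICU shared libraries can be loaded.")
```

Running this inside the production image reports every library that still has to be copied, instead of surfacing them one ImportError at a time.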

I know there are other approaches, such as using a virtual environment, but this one was more convenient for me.
