I'm using the Airflow docker-compose file to spin up some containers and run Airflow locally.
For my DAG, I'm using the DockerOperator to schedule a run of a Docker image. In that image I have a Python script, part of which connects to Google Cloud Secret Manager to fetch an API key I have stored there.
I have these lines in the Python script to authenticate with Google Cloud:
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/tmp/keys/keys.json"
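Since google-auth only complains about the missing file when it actually loads the credentials, a small preflight check makes it obvious whether the key file is visible from inside the container at all. This is a hypothetical helper I'm sketching, not part of the original script:

```python
import os

def credentials_visible(path: str) -> bool:
    """Return True if the service-account key file exists and is readable."""
    return os.path.isfile(path) and os.access(path, os.R_OK)

# Same path the script sets; fall back to it if the env var is unset.
key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", "/tmp/keys/keys.json")
if not credentials_visible(key_path):
    print(f"key file not visible at {key_path}; check the container's mounts")
```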
I can run the Python file on its own and it works, no problem.
The main goal is to run this script from its own container, with Airflow scheduling the container runs. However, when I try to run the DAG, it errors out with:
File "/usr/local/lib/python3.10/site-packages/google/auth/_default.py", line 121, in load_credentials_from_file
raise exceptions.DefaultCredentialsError(
google.auth.exceptions.DefaultCredentialsError: File /tmp/keys/keys.json was not found.; 93)
[2023-02-28, 01:51:15 UTC] {local_task_job.py:208} INFO - Task exited with return code 1
[2023-02-28, 01:51:15 UTC] {taskinstance.py:2578} INFO - 0 downstream tasks scheduled from follow-on schedule check
In the docker-compose.yml file, I added a line to mount the file into the Airflow containers, like so:
x-airflow-common:
  &airflow-common
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.5.1}
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres/${POSTGRES_DB}
    # For backward compatibility, with Airflow <2.3
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres/${POSTGRES_DB}
    AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres/${POSTGRES_DB}
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__FERNET_KEY: ''
    AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
    AIRFLOW__API__AUTH_BACKENDS: 'airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session'
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
  volumes:
    - ${AIRFLOW_PROJ_DIR:-.}/dags:/opt/airflow/dags
    - ${AIRFLOW_PROJ_DIR:-.}/logs:/opt/airflow/logs
    - ${AIRFLOW_PROJ_DIR:-.}/plugins:/opt/airflow/plugins
    - /tmp/keys/keys.json:/tmp/keys/keys.json # <------ added here
    - /var/run/docker.sock:/var/run/docker.sock
  user: "${AIRFLOW_UID:-50000}:0"
  depends_on:
    &airflow-common-depends-on
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy
When I start the containers with docker-compose up, I can exec into one of the containers and find /tmp/keys/keys.json, so I know it's there. But when I trigger the DAG, it says the file can't be found. I'm not sure whether I'm mounting it correctly, or what else the container needs in order to find the path and file.
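The manual check I do after exec'ing into the container boils down to this (just a sketch of my verification step, runnable in any shell):

```shell
# Run inside one of the Airflow containers after `docker-compose up`.
if [ -r /tmp/keys/keys.json ]; then
  echo "key file visible"
else
  echo "key file missing"
fi
```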
The DAG:
from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator
from datetime import datetime

with DAG("population", start_date=datetime(2023, 1, 1), schedule_interval='@daily', catchup=False) as dag:
    task_a = DockerOperator(
        task_id="task_a",
        image='population:1.0',
        command='python3 population.py',
        docker_url='tcp://docker-proxy:2375',
        network_mode='host',
        container_name='population-container'
    )
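One thing I've been considering, but haven't tried yet, is passing the file through the operator itself via its mounts parameter, since the task container is started fresh from population:1.0. A sketch of that variant (untested; the Mount type comes from the docker SDK, and whether this is the right fix is my assumption):

```python
from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator
from datetime import datetime
from docker.types import Mount  # docker SDK type accepted by DockerOperator

with DAG("population", start_date=datetime(2023, 1, 1), schedule_interval='@daily', catchup=False) as dag:
    task_a = DockerOperator(
        task_id="task_a",
        image='population:1.0',
        command='python3 population.py',
        docker_url='tcp://docker-proxy:2375',
        network_mode='host',
        container_name='population-container',
        # Bind the host key file into the task container as well
        # (assumption: the DockerOperator's container is what needs the file).
        mounts=[Mount(source='/tmp/keys/keys.json',
                      target='/tmp/keys/keys.json',
                      type='bind')],
    )
```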