Docker: sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not translate host name "pgdatabase" to address: Unknown host


I am running the pgdatabase and pgadmin services in Docker on one network (pg-network), and then ingesting data into the database. The docker-compose file, the Dockerfile, and the ingestion code are given below.


I ran the following commands:

  1. docker-compose up
  2. docker build -t test .
  3. python ingest.py

Files

docker-compose.yml

services:
  pgdatabase:
    image: postgres:13
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=ny_taxi
    volumes:
      - "./ny_taxi_postgres_data:/var/lib/postgresql/data:rw"
    ports:
      - "5432:5432"
  pgadmin:
    image: dpage/pgadmin4
    environment:
      - PGADMIN_DEFAULT_EMAIL=[email protected]
      - PGADMIN_DEFAULT_PASSWORD=root
    volumes:
      - "./pgadmin_conn_data:/var/lib/pgadmin:rw"
    ports:
      - "8080:80"

Dockerfile

FROM python:3.9

RUN apt-get update && apt-get install -y wget
RUN pip install pandas==2.1.2 sqlalchemy==2.0.23 pyarrow==8.0.0 psycopg2==2.9.5 psycopg2-binary==2.9.5

WORKDIR /app
COPY ingest_data.py ingest_data.py 

ENTRYPOINT [ "python", "ingest_data.py" ]

Command:

docker build -t test .

ingest.py

#!/usr/bin/env python
# coding: utf-8
import os
import argparse
from time import time
import pandas as pd
from sqlalchemy import create_engine


def ingest_data(user, password, host, port, db, table_name, csv_url):

    # the backup files are gzipped, and it's important to keep the correct extension
    # for pandas to be able to open the file
    if csv_url.endswith('.csv.gz'):
        csv_name = 'yellow_tripdata_2021-01.csv.gz'
    else:
        csv_name = 'output.csv'

    os.system(f"wget {csv_url} -O {csv_name}")
    postgres_url = f'postgresql://{user}:{password}@{host}:{port}/{db}'
    engine = create_engine(postgres_url)

    df_iter = pd.read_csv(csv_name, iterator=True, chunksize=100000)

    df = next(df_iter)

    df.tpep_pickup_datetime = pd.to_datetime(df.tpep_pickup_datetime)
    df.tpep_dropoff_datetime = pd.to_datetime(df.tpep_dropoff_datetime)

    df.head(n=0).to_sql(name=table_name, con=engine, if_exists='replace')

    df.to_sql(name=table_name, con=engine, if_exists='append')

    while True:

        try:
            t_start = time()

            df = next(df_iter)

            df.tpep_pickup_datetime = pd.to_datetime(df.tpep_pickup_datetime)
            df.tpep_dropoff_datetime = pd.to_datetime(df.tpep_dropoff_datetime)

            df.to_sql(name=table_name, con=engine, if_exists='append')

            t_end = time()

            print('inserted another chunk, took %.3f second' %
                  (t_end - t_start))

        except StopIteration:
            print("Finished ingesting data into the postgres database")
            break


if __name__ == '__main__':
    user = "root"
    password = "root"
    host = "pgdatabase"
    port = "5432"
    db = "ny_taxi"
    table_name = "yellow_taxi_trips"
    csv_url = "https://github.com/DataTalksClub/nyc-tlc-data/releases/download/yellow/yellow_tripdata_2021-01.csv.gz"

    ingest_data(user, password, host, port, db, table_name, csv_url)

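After running

docker-compose up
I go to http://localhost:8080/

Username: [email protected]

Password: root

and set up the server with the host name pgdatabase.

Then I run

python ingest.py
to ingest the data into the postgres database, but I get the error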
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not translate host name "pgdatabase" to address: Unknown host

postgresql docker-compose dockerfile pgadmin-4 data-engineering
1 Answer
0 votes

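It may be validating the host name and expecting a "." as in an IP address or a TLD. Could you use a full database connection string for your connection?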

postgresql://root:root@pgdatabase:5432/ny_taxi
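As a rough illustration of the connection-string suggestion, the sketch below assembles the URL from its parts. The helper name build_postgres_url is hypothetical, and the values are the ones from the question. Note that SQLAlchemy 1.4 and later expect the scheme to be spelled postgresql://, and that percent-encoding the credentials only matters if they contain characters such as "@" or ":".

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, port, db):
    """Assemble a postgresql:// URL (hypothetical helper, not from the question)."""
    # Percent-encode the credentials so characters like "@" or ":" in a
    # password cannot be mistaken for URL delimiters.
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db}"
    )


# Values taken from the question's docker-compose.yml and ingest script.
print(build_postgres_url("root", "root", "pgdatabase", 5432, "ny_taxi"))
```

The resulting string can be passed to create_engine in place of the f-string used in the question.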

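If the Python file is not being run from one of the containers, then the host should probably be "127.0.0.1" or "localhost", because the docker-compose namespace (the service names) is not available to your host laptop, only to processes running inside the containers.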
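One way to make the same script work in both places is to check whether the service name resolves before building the connection URL. This is only a minimal sketch: pick_db_host is a hypothetical helper, and falling back to "localhost" assumes that the database port is published to the host, as the "5432:5432" mapping in the question's docker-compose.yml does.

```python
import socket


def pick_db_host(service_name="pgdatabase", fallback="localhost"):
    """Return service_name if it resolves from here, otherwise the fallback."""
    try:
        # Inside a container on the same Docker network this lookup succeeds;
        # on the host machine the compose service name is usually unknown.
        socket.getaddrinfo(service_name, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return fallback
    return service_name


host = pick_db_host()
print(f"connecting to postgres at {host}:5432")
```

In the question's script this would replace the hard-coded host = "pgdatabase" line. If the ingestion script is instead run from the image built from the Dockerfile, that container needs to be attached to the same Docker network as the pgdatabase service for the name to resolve.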

A last possibility is to add a container_name: attribute to each service, which has made a difference for me once or twice.
