Pre-populating PostGIS with OSM data works on Docker Desktop, but not in an Azure pipeline


When I build the Dockerfile below by manually running the following command on my Windows laptop:

docker build -f Dockerfile --no-cache --tag postgis-berlin:16-3.4 \
    --build-arg DOWNLOAD_URLS="https://download.geofabrik.de/europe/germany/berlin-latest.osm.pbf" \
    --build-arg OSM_PASSWORD=osm_password \
    --build-arg POSTGIS_TAG=16-3.4 .

it works as expected, and the pre-populated PostgreSQL database is present in the container started in Docker Desktop.

But when I build the same Dockerfile through an Azure pipeline:

- task: AzureCLI@2
  displayName: Build and push PostGIS image
  inputs:
    azureSubscription: $(ArmConnection)
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      image_tag='$(ContainerRegistry)/postgis-${{ parameters.OsmRegion }}:${{ parameters.PostGisTag }}'
      # delete old images to avoid the error "no space left on device"
      docker system prune --all --force
      docker build --file $(Build.SourcesDirectory)/src/Services/PostGIS/Dockerfile --no-cache --tag $image_tag --build-arg POSTGIS_TAG=${{parameters.PostGisTag}} --build-arg OSM_PASSWORD=$(OsmPassword) --build-arg DOWNLOAD_URLS="$(DownloadUrls)" $(Build.SourcesDirectory)/src/
      # building the Docker image takes a long time, so run az acr login just before pushing the image
      az acr login -n $(ContainerRegistry)
      docker push $image_tag

and then pull the built image from ACR and start it on Kubernetes, the database folder is empty.

I know that starting and stopping services during "docker build" is probably not recommended practice.

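The reason I do not populate the database by running a script in the /docker-entrypoint-initdb.d folder is that in production we use the 28 GB europe-latest.osm.pbf file (rather than the small Berlin file), so starting a pod would take several hours.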

Below is my Dockerfile, which I created by looking at https://github.com/postgis/docker-postgis/blob/master/Dockerfile.alpine.template:

ARG POSTGIS_TAG
FROM postgis/postgis:$POSTGIS_TAG
EXPOSE 5432

# Install ps, top, netstat, curl, wget, osmium, osm2pgsql
RUN apt-get update && \
    apt-get upgrade -y && \
    apt install -y procps net-tools curl wget osmium-tool osm2pgsql

WORKDIR /data
RUN chown -R postgres:postgres /data
USER postgres

# Download one or multiple OSM files
ARG DOWNLOAD_URLS
RUN wget --no-verbose $DOWNLOAD_URLS
# Print summaries of the downloaded OSM files
RUN for f in *.osm.pbf; do osmium fileinfo -e "$f"; done
# Merge one or multiple files into a new file map.osm.pbf
RUN osmium merge *.osm.pbf -o map.osm.pbf
# Remove all files except the map.osm.pbf
RUN find . -type f -name '*.osm.pbf' | grep -v 'map.osm.pbf' | xargs rm -f

ARG OSM_PASSWORD
ENV PGPASSWORD=$OSM_PASSWORD
ENV PGUSER=osm_user
ENV PGDATABASE=osm_database
ENV PGDATA=/var/lib/postgresql/data

# create PostgreSQL instance in the $PGDATA folder
# configure PostgreSQL as recommended by osm2pgsql doc
# start PostgreSQL and create database and user
# load data from the map.osm.pbf file into the database
# remove the map.osm.pbf file and stop PostgreSQL
# configure password based access for osm_user
RUN set -eux && \
    pg_ctl init && \
    echo "shared_buffers = 1GB"                >> $PGDATA/postgresql.conf && \
    echo "work_mem = 50MB"                     >> $PGDATA/postgresql.conf && \
    echo "maintenance_work_mem = 10GB"         >> $PGDATA/postgresql.conf && \
    echo "autovacuum_work_mem = 2GB"           >> $PGDATA/postgresql.conf && \
    echo "wal_level = minimal"                 >> $PGDATA/postgresql.conf && \
    echo "checkpoint_timeout = 60min"          >> $PGDATA/postgresql.conf && \
    echo "max_wal_size = 10GB"                 >> $PGDATA/postgresql.conf && \
    echo "checkpoint_completion_target = 0.9"  >> $PGDATA/postgresql.conf && \
    echo "max_wal_senders = 0"                 >> $PGDATA/postgresql.conf && \
    echo "random_page_cost = 1.0"              >> $PGDATA/postgresql.conf && \
    echo "password_encryption = scram-sha-256" >> $PGDATA/postgresql.conf && \
    pg_ctl start && \
    createuser --username=postgres $PGUSER && \
    createdb --username=postgres --encoding=UTF8 --owner=$PGUSER $PGDATABASE && \
    psql --username=postgres $PGDATABASE --command="ALTER USER $PGUSER WITH PASSWORD '$PGPASSWORD';" && \
    psql --username=postgres $PGDATABASE --command='CREATE EXTENSION IF NOT EXISTS postgis;' && \
    psql --username=postgres $PGDATABASE --command='CREATE EXTENSION IF NOT EXISTS hstore;' && \
    osm2pgsql --username=$PGUSER --database=$PGDATABASE --create --cache=60000 --hstore --latlong /data/map.osm.pbf && \
    rm -f /data/map.osm.pbf && \
    pg_ctl stop && \
    echo '# TYPE DATABASE USER ADDRESS METHOD'                > $PGDATA/pg_hba.conf && \
    echo "local all postgres peer"                           >> $PGDATA/pg_hba.conf && \
    echo "local $PGDATABASE $PGUSER           scram-sha-256" >> $PGDATA/pg_hba.conf && \
    echo "host  $PGDATABASE $PGUSER 0.0.0.0/0 scram-sha-256" >> $PGDATA/pg_hba.conf

I have also asked the same question in the docker-postgis GitHub discussions.

docker azure-pipelines postgis openstreetmap osm2pgsql
1 Answer

I checked the same Dockerfile with an Ubuntu-latest agent on my side, and it reports an "Out of memory" error.

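After I reduced the --cache value of the osm2pgsql command in the Dockerfile from 60000 to 10000, the pipeline works fine on my side, and the output of RUN ls -al $PGDATA is correct.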
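Since osm2pgsql takes the --cache value in megabytes, 60000 asks for roughly 60 GB, which a build agent may not have. As a rough sketch (not part of the tested fix above), the value could be derived from the memory actually available instead of being hard-coded. The helper names pick_cache_mb and mem_available_kb are hypothetical, and the 75% ceiling is an assumption, not a value taken from this pipeline:

```shell
#!/bin/sh
# Hypothetical helper: cap a requested osm2pgsql --cache value (in MB)
# so that it fits into the memory that is actually available.
# Usage: pick_cache_mb <mem_available_kb> <requested_mb>
pick_cache_mb() {
    mem_kb=$1
    requested_mb=$2
    # Assumption: let the cache use at most 75% of the available memory
    limit_mb=$(( mem_kb / 1024 * 75 / 100 ))
    if [ "$requested_mb" -gt "$limit_mb" ]; then
        echo "$limit_mb"
    else
        echo "$requested_mb"
    fi
}

# Hypothetical helper: read MemAvailable (in kB) from /proc/meminfo (Linux only)
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}

# Possible use inside the build step:
#   osm2pgsql --cache="$(pick_cache_mb "$(mem_available_kb)" 60000)" ...
```

With this, the same Dockerfile would keep the large cache on a machine that has the memory and fall back to a smaller one on a constrained agent.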

Dockerfile:

ARG POSTGIS_TAG
FROM postgis/postgis:$POSTGIS_TAG
EXPOSE 5432

# Install ps, top, netstat, curl, wget, osmium, osm2pgsql
RUN apt-get update && \
    apt-get upgrade -y && \
    apt install -y procps net-tools curl wget osmium-tool osm2pgsql

WORKDIR /data
RUN chown -R postgres:postgres /data
USER postgres

# Download one or multiple OSM files
ARG DOWNLOAD_URLS
RUN wget --no-verbose $DOWNLOAD_URLS
# Print summaries of the downloaded OSM files
RUN for f in *.osm.pbf; do osmium fileinfo -e "$f"; done
# Merge one or multiple files into a new file map.osm.pbf
RUN osmium merge *.osm.pbf -o map.osm.pbf
# Remove all files except the map.osm.pbf
RUN find . -type f -name '*.osm.pbf' | grep -v 'map.osm.pbf' | xargs rm -f

ARG OSM_PASSWORD
ENV PGPASSWORD=$OSM_PASSWORD
ENV PGUSER=osm_user
ENV PGDATABASE=osm_database
ENV PGDATA=/var/lib/postgresql/data

RUN set -eux && \
    pg_ctl init && \
    echo "shared_buffers = 1GB"                >> $PGDATA/postgresql.conf && \
    echo "work_mem = 50MB"                     >> $PGDATA/postgresql.conf && \
    echo "maintenance_work_mem = 10GB"         >> $PGDATA/postgresql.conf && \
    echo "autovacuum_work_mem = 2GB"           >> $PGDATA/postgresql.conf && \
    echo "wal_level = minimal"                 >> $PGDATA/postgresql.conf && \
    echo "checkpoint_timeout = 60min"          >> $PGDATA/postgresql.conf && \
    echo "max_wal_size = 10GB"                 >> $PGDATA/postgresql.conf && \
    echo "checkpoint_completion_target = 0.9"  >> $PGDATA/postgresql.conf && \
    echo "max_wal_senders = 0"                 >> $PGDATA/postgresql.conf && \
    echo "random_page_cost = 1.0"              >> $PGDATA/postgresql.conf && \
    echo "password_encryption = scram-sha-256" >> $PGDATA/postgresql.conf && \
    pg_ctl start && \
    createuser --username=postgres $PGUSER && \
    createdb --username=postgres --encoding=UTF8 --owner=$PGUSER $PGDATABASE && \
    psql --username=postgres $PGDATABASE --command="ALTER USER $PGUSER WITH PASSWORD '$PGPASSWORD';" && \
    psql --username=postgres $PGDATABASE --command='CREATE EXTENSION IF NOT EXISTS postgis;' && \
    psql --username=postgres $PGDATABASE --command='CREATE EXTENSION IF NOT EXISTS hstore;' && \
    osm2pgsql --username=$PGUSER --database=$PGDATABASE --create --cache=10000 --hstore --latlong /data/map.osm.pbf && \
    rm -f /data/map.osm.pbf && \
    pg_ctl stop && \
    echo '# TYPE DATABASE USER ADDRESS METHOD'                > $PGDATA/pg_hba.conf && \
    echo "local all postgres peer"                           >> $PGDATA/pg_hba.conf && \
    echo "local $PGDATABASE $PGUSER           scram-sha-256" >> $PGDATA/pg_hba.conf && \
    echo "host  $PGDATABASE $PGUSER 0.0.0.0/0 scram-sha-256" >> $PGDATA/pg_hba.conf

RUN ls -al $PGDATA
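
The final RUN ls -al $PGDATA step only prints the directory listing, so someone still has to read the log. A stricter variant, sketched here as an assumption rather than something the pipeline above does, would fail the build when the data directory was never initialised. It relies on the PG_VERSION file that PostgreSQL writes into every initialised data directory. The function name pgdata_is_initialised is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory looks like an
# initialised PostgreSQL data directory (it contains a non-empty PG_VERSION).
# Usage: pgdata_is_initialised <directory>
pgdata_is_initialised() {
    [ -s "$1/PG_VERSION" ]
}

# Possible use as the last build step, in place of the plain listing:
#   RUN test -s "$PGDATA/PG_VERSION"
```

The same check could be run against the pushed image before deploying it, so that an empty database folder is caught in the pipeline and not on Kubernetes.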

Pipeline YAML (to avoid any missing values, I defined the parameters as variables):

trigger: none

variables:
  - name: tag
    value: $(Build.BuildId)
  - name: DownloadUrls
    value: https://download.geofabrik.de/europe/germany/berlin-latest.osm.pbf
  - name: image_tag
    value: mycontainerregistry.azurecr.io/postgis-test:16-3.4
  - name: POSTGIS_TAG
    value: 16-3.4
  - name: OSM_PASSWORD
    value: testpwd

stages:
- stage: Build
  displayName: Build image
  jobs:
  - job: Build
    displayName: Build
    pool:
      vmImage: Ubuntu-latest
    steps:
    - task: AzureCLI@2
      inputs:
        azureSubscription: 'ARMConn1'
        scriptType: 'bash'
        scriptLocation: 'inlineScript'
        inlineScript: |
          docker system prune --all --force
          docker build --file $(Build.SourcesDirectory)/src/Services/PostGIS/Dockerfile --no-cache --tag $(image_tag) --build-arg POSTGIS_TAG=$(POSTGIS_TAG) --build-arg OSM_PASSWORD=$(OSM_PASSWORD) --build-arg DOWNLOAD_URLS="$(DownloadUrls)" $(Build.SourcesDirectory)/src/

Output in the pipeline:
