Problem uploading data to PostGIS with Python

Good morning!

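I'm working on an exercise that involves downloading data from OpenStreetMap, doing some ETL work, and uploading the result to a PostGIS database. The download and the ETL run fine, but I'm having trouble uploading the data into a table with to_postgis or to_sql.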

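The problem is that I can upload the data with if_exists='replace', but when I try to update the table with if_exists='append' it doesn't work. Append only works when the table doesn't exist yet in my database: the first run creates the table and inserts the data, but if I run it again to add more data it no longer works :(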

I'll put the code I'm using here. Sorry if it's a bit messy, but I'm a beginner hehe

These are my imports, maybe more than I need

import geopandas as gpd
import pandas as pd
import requests
import os
import json
import osmnx as osm
import sqlalchemy as sql
import shapely
from geoalchemy2 import Geometry

My connection parameters and other constants

USERNAME = 'postgres'
PASSWORD = 'postgres'
HOST = 'localhost'
PORT = 5433
DB = 'programming_project'
AMENITY_LIST_POLYGON = ["hospital",'school','university']

LINK_DB = sql.create_engine(f"postgresql://{USERNAME}:{PASSWORD}@{HOST}:{PORT}/{DB}")

Here is the ETL I do on the OpenStreetMap data. I fetch data for Lisbon with the tag amenity, using the values in the AMENITY_LIST_POLYGON list

# ETL amenities polygon

#download facilities from OSM
fac_amenities_pol= osm.geometries_from_place("Lisbon",tags={"amenity":AMENITY_LIST_POLYGON})

#to filter the columns that we want
fac_amenities_pol=fac_amenities_pol[['geometry','amenity','name','addr:postcode','addr:street','email','website','addr:housenumber','phone','contact:phone','contact:email','contact:website']].reset_index()
fac_amenities_pol['email'] = fac_amenities_pol['email'].fillna(fac_amenities_pol.pop('contact:email'))
fac_amenities_pol['phone'] = fac_amenities_pol['phone'].fillna(fac_amenities_pol.pop('contact:phone'))
fac_amenities_pol['website'] = fac_amenities_pol['website'].fillna(fac_amenities_pol.pop('contact:website'))


#to concatenate the address
fac_amenities_pol['address']=fac_amenities_pol['addr:street']+', Nº '+fac_amenities_pol['addr:housenumber']+' - Postal Code: '+fac_amenities_pol['addr:postcode']
fac_amenities_pol= fac_amenities_pol.drop(columns=['addr:street','addr:housenumber','addr:postcode'])

# Rename columns to match the database model
# (the addr:* columns were already merged into "address" and dropped above,
# so only the remaining columns need renaming)
fac_amenities_pol = fac_amenities_pol.rename(columns={
        "amenity":"facility",
        "phone":"phone_number"})
        
#to filter only the geometry that we want
fac_amenities_pol=fac_amenities_pol.query("element_type != 'node'")

fac_amenities_pol.rename_geometry('geom',inplace=True)
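
A quick check that may be worth running at this point: the target column is declared as POLYGON with SRID 4326, so it can help to see which geometry types and which CRS the frame actually holds before uploading. The sketch below uses made-up geometries as a stand-in for fac_amenities_pol (the names sample and polygons_only are hypothetical), and whether the real OSM result contains multi-part or non-polygon geometries is an assumption, not something shown in this post.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Point, Polygon

# Toy stand-in for fac_amenities_pol (made-up geometries, not OSM data).
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
triangle = Polygon([(2, 0), (3, 0), (3, 1)])
sample = gpd.GeoDataFrame(
    {"facility": ["school", "hospital", "university"]},
    geometry=[square, MultiPolygon([square, triangle]), Point(5, 5)],
    crs="EPSG:4326",
).rename_geometry("geom")

# 1. See which geometry types are present.
print(sample.geom_type.value_counts())

# 2. Split multi-part geometries into single parts, then keep polygons only.
polygons_only = sample.explode(ignore_index=True)
polygons_only = polygons_only[polygons_only.geom_type == "Polygon"]

# 3. Confirm the CRS matches the SRID declared for the column (4326).
print(polygons_only.crs.to_epsg())
```

If the first print lists anything other than Polygon, a POLYGON-typed column would be expected to reject those rows, independently of the append problem described next.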

This is where my problem starts.

#to insert the data into database
fac_amenities_pol.to_postgis("facilities",LINK_DB,if_exists='append', index=False, dtype={'geom': Geometry(geometry_type='POLYGON', srid= 4326)})

With to_postgis, the error is

AttributeError                            Traceback (most recent call last)
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\sqlalchemy\engine\base.py:1410, in Connection.execute(self, statement, parameters, execution_options)
   1409 try:
-> 1410     meth = statement._execute_on_connection
   1411 except AttributeError as err:

AttributeError: 'str' object has no attribute '_execute_on_connection'

The above exception was the direct cause of the following exception:

ObjectNotExecutableError                  Traceback (most recent call last)
Cell In[159], line 2
      1 #to insert the data into database
----> 2 fac_amenities_pol.to_postgis("facilities11",LINK_DB,if_exists='append', index=False, dtype={'geom': Geometry(geometry_type='POLYGON', srid= 4326)})

File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\geopandas\geodataframe.py:1931, in GeoDataFrame.to_postgis(self, name, con, schema, if_exists, index, index_label, chunksize, dtype)
   1871 def to_postgis(
   1872     self,
   1873     name,
   (...)
   1880     dtype=None,
   1881 ):
   1882     """
   1883     Upload GeoDataFrame into PostGIS database.
...
   1416         distilled_parameters,
   1417         execution_options or NO_OPTIONS,
   1418     )

ObjectNotExecutableError: Not an executable object: "SELECT Find_SRID('public', 'facilities11', 'geom');"
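
The traceback above shows a plain Python string being handed to Connection.execute. In SQLAlchemy 2.x, textual SQL has to be wrapped in sqlalchemy.text(), while 1.x releases still accepted raw strings. One possible cause (an assumption, not confirmed anywhere in this thread) is an installed GeoPandas release that predates SQLAlchemy 2.x support: the Find_SRID query is the SRID check against an existing table, which would fit a failure that only appears when appending to a table that already exists. The sketch below reproduces the general behavior using an in-memory SQLite engine as a stand-in for the PostGIS database; run_scalar is a hypothetical helper.

```python
import sqlalchemy as sql
from sqlalchemy.exc import ObjectNotExecutableError

# In-memory SQLite stands in for the PostGIS database (assumption for the sketch).
engine = sql.create_engine("sqlite://")


def run_scalar(conn, statement):
    """Execute a statement and return the first column of the first row."""
    return conn.execute(statement).scalar()


with engine.connect() as conn:
    # A raw string: accepted by SQLAlchemy 1.x, rejected by 2.x.
    try:
        raw_result = run_scalar(conn, "SELECT 1")
        raw_error = None
    except ObjectNotExecutableError as err:
        raw_result = None
        raw_error = err

    # The same statement wrapped in text(): accepted by both.
    wrapped_result = run_scalar(conn, sql.text("SELECT 1"))

print("SQLAlchemy version:", sql.__version__)
print("plain string ->", raw_error or raw_result)
print("text()       ->", wrapped_result)
```

If the plain-string call fails here in the same way, comparing sql.__version__ with gpd.__version__ is a reasonable next step; upgrading GeoPandas, or pinning SQLAlchemy below 2.0, are the two things to try.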

With to_sql the error is different, but I couldn't find a solution from anyone.

fac_amenities_pol.to_sql("facilities",LINK_DB, if_exists='append', index=False,dtype={'geom': Geometry('POLYGON', srid=4326)})

Error

---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\sqlalchemy\engine\base.py:2100, in Connection._exec_insertmany_context(self, dialect, context)
   2099     else:
-> 2100         dialect.do_execute(cursor, sub_stmt, sub_params, context)
   2102 except BaseException as e:

File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\sqlalchemy\engine\default.py:747, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
    746 def do_execute(self, cursor, statement, parameters, context=None):
--> 747     cursor.execute(statement, parameters)

ProgrammingError: can't adapt type 'Polygon'

The above exception was the direct cause of the following exception:

ProgrammingError                          Traceback (most recent call last)
Cell In[160], line 3
----> 3 fac_amenities_pol.to_sql("facilities",LINK_DB, if_exists='append', index=False,dtype={'geom': Geometry('POLYGON', srid=4326)})

File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\pandas\core\generic.py:2987, in NDFrame.to_sql(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
   2830 """
   2831 Write records stored in a DataFrame to a SQL database.
   2832 

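I know that to_sql doesn't recognize polygon geometries, but I also tried using points, and exploding multipolygons into polygons, and neither worked. to_postgis doesn't recognize the SRID, but as I said before, if the table doesn't exist it creates the table with the correct SRID (4326); it only fails when I try to append new data.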

I really don't know what's going on. I searched a lot here for a solution, but it looks like nobody has had these problems before.

python postgis geopandas