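Good morning!
I'm working on an exercise that involves downloading data from OpenStreetMap, doing some ETL work, and uploading the result to a PostGIS database. The download and the ETL work fine, but I'm having trouble uploading the data into a table with to_postgis or to_sql.
The problem is that I can upload the data with if_exists='replace', but it fails when I try to update the table with if_exists='append'. Append only works when the table does not yet exist in my database: the first run creates the table and inserts the data, but if I run it again to add more data, it no longer works :(
I'll put the code I'm using here. Sorry if it's a bit messy, I'm a beginner hehe
These are my libraries; there may be more here than I actually need.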
import geopandas as gpd
import pandas as pd
import requests
import os
import json
import osmnx as osm
import sqlalchemy as sql
import shapely
from geoalchemy2 import Geometry
My connection parameters and other constants
USERNAME = 'postgres'
PASSWORD = 'postgres'
HOST = 'localhost'
PORT = 5433
DB = 'programming_project'
AMENITY_LIST_POLYGON = ["hospital",'school','university']
LINK_DB = sql.create_engine(f"postgresql://{USERNAME}:{PASSWORD}@{HOST}:{PORT}/{DB}")
Here is the ETL I do on the OpenStreetMap data. I fetch data for Lisbon with the tag amenity and the values listed in AMENITY_LIST_POLYGON.
# ETL amenities polygon
# Download facilities from OSM
fac_amenities_pol = osm.geometries_from_place("Lisbon", tags={"amenity": AMENITY_LIST_POLYGON})
# Keep only the columns that we want
fac_amenities_pol = fac_amenities_pol[[
    'geometry', 'amenity', 'name', 'addr:postcode', 'addr:street', 'email', 'website',
    'addr:housenumber', 'phone', 'contact:phone', 'contact:email', 'contact:website',
]].reset_index()
fac_amenities_pol['email'] = fac_amenities_pol['email'].fillna(fac_amenities_pol.pop('contact:email'))
fac_amenities_pol['phone'] = fac_amenities_pol['phone'].fillna(fac_amenities_pol.pop('contact:phone'))
fac_amenities_pol['website'] = fac_amenities_pol['website'].fillna(fac_amenities_pol.pop('contact:website'))
# Concatenate the address
fac_amenities_pol['address'] = (
    fac_amenities_pol['addr:street']
    + ', Nº ' + fac_amenities_pol['addr:housenumber']
    + ' - Postal Code: ' + fac_amenities_pol['addr:postcode']
)
fac_amenities_pol = fac_amenities_pol.drop(columns=['addr:street', 'addr:housenumber', 'addr:postcode'])
# Rename columns to match the database model
fac_amenities_pol = fac_amenities_pol.rename(columns={
    "addr:street": "address",
    "addr:postcode": "postal_cod",
    "amenity": "facility",
    "phone": "phone_number",
})
# Keep only the geometry types that we want
fac_amenities_pol = fac_amenities_pol.query("element_type != 'node'")
fac_amenities_pol.rename_geometry('geom', inplace=True)
This is where my problem starts.
#to insert the data into database
fac_amenities_pol.to_postgis("facilities", LINK_DB, if_exists='append', index=False, dtype={'geom': Geometry(geometry_type='POLYGON', srid=4326)})
With to_postgis, the error is:
AttributeError Traceback (most recent call last)
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\sqlalchemy\engine\base.py:1410, in Connection.execute(self, statement, parameters, execution_options)
1409 try:
-> 1410 meth = statement._execute_on_connection
1411 except AttributeError as err:
AttributeError: 'str' object has no attribute '_execute_on_connection'
The above exception was the direct cause of the following exception:
ObjectNotExecutableError Traceback (most recent call last)
Cell In[159], line 2
1 #to insert the data into database
----> 2 fac_amenities_pol.to_postgis("facilities11",LINK_DB,if_exists='append', index=False, dtype={'geom': Geometry(geometry_type='POLYGON', srid= 4326)})
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\geopandas\geodataframe.py:1931, in GeoDataFrame.to_postgis(self, name, con, schema, if_exists, index, index_label, chunksize, dtype)
1871 def to_postgis(
1872 self,
1873 name,
(...)
1880 dtype=None,
1881 ):
1882 """
1883 Upload GeoDataFrame into PostGIS database.
...
1416 distilled_parameters,
1417 execution_options or NO_OPTIONS,
1418 )
ObjectNotExecutableError: Not an executable object: "SELECT Find_SRID('public', 'facilities11', 'geom');"
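The traceback above ends inside SQLAlchemy's Connection.execute, which complains that it was handed a plain string (the SELECT Find_SRID(...) query) rather than an executable object. SQLAlchemy 2.x no longer executes raw strings (they have to be wrapped in sqlalchemy.text()), so this might be a version mismatch between geopandas and SQLAlchemy rather than something in my own code. That is only a guess on my part. A minimal sketch to list the installed versions, where installed_versions is a helper name made up for this post and the package names are the ones I assume are published on PyPI:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a {package: version} dict, marking packages that are missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed under this exact name.
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    # Assumed distribution names; psycopg2 may be installed as psycopg2-binary.
    names = ["geopandas", "SQLAlchemy", "GeoAlchemy2", "pandas", "shapely", "psycopg2"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver}")
```

The output depends on the environment, so none is shown here.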
With to_sql the error is different, but I couldn't find a solution from anyone.
fac_amenities_pol.to_sql("facilities", LINK_DB, if_exists='append', index=False, dtype={'geom': Geometry('POLYGON', srid=4326)})
Error:
---------------------------------------------------------------------------
ProgrammingError Traceback (most recent call last)
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\sqlalchemy\engine\base.py:2100, in Connection._exec_insertmany_context(self, dialect, context)
2099 else:
-> 2100 dialect.do_execute(cursor, sub_stmt, sub_params, context)
2102 except BaseException as e:
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\sqlalchemy\engine\default.py:747, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
746 def do_execute(self, cursor, statement, parameters, context=None):
--> 747 cursor.execute(statement, parameters)
ProgrammingError: can't adapt type 'Polygon'
The above exception was the direct cause of the following exception:
ProgrammingError Traceback (most recent call last)
Cell In[160], line 3
----> 3 fac_amenities_pol.to_sql("facilities",LINK_DB, if_exists='append', index=False,dtype={'geom': Geometry('POLYGON', srid=4326)})
File c:\Users\conta\miniconda3\envs\lisbon_engine\Lib\site-packages\pandas\core\generic.py:2987, in NDFrame.to_sql(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
2830 """
2831 Write records stored in a DataFrame to a SQL database.
2832
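I know to_sql doesn't recognize Polygon geometries, but I also tried using points, and splitting the multipolygons into polygons, and neither worked. to_postgis doesn't recognize the SRID, but as I said before, if the table doesn't exist it creates the table with the correct SRID (4326); it only fails when I try to append new data.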
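The "can't adapt type 'Polygon'" message comes from the database driver, which does not know how to serialize a shapely object. One possible workaround would be to hand it text instead, since PostGIS can parse EWKT strings of the form SRID=4326;POLYGON(...). A minimal sketch of that conversion, not verified against my table; to_ewkt and FakePolygon are hypothetical names, and FakePolygon only stands in for a shapely geometry so the snippet runs on its own:

```python
def to_ewkt(geom, srid=4326):
    """Build an EWKT string such as 'SRID=4326;POLYGON ((...))'.

    `geom` is any object exposing a `.wkt` attribute, as shapely geometries do.
    """
    return f"SRID={srid};{geom.wkt}"


class FakePolygon:
    # Stand-in for a shapely Polygon so the sketch runs without shapely.
    wkt = "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"


print(to_ewkt(FakePolygon()))
# prints: SRID=4326;POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))
```

With the real data this would presumably be applied to the geom column of a plain pandas DataFrame copy before calling to_sql, since a GeoDataFrame may not accept strings in its geometry column.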
I really don't know what is going on. I've searched a lot here for a solution, but it looks like nobody has run into these problems before.