我有来自 https://data.humdata.org/dataset/kontur-population-dataset? 的数据,我想选择边界框内的多边形。例如,我定义边界框如下(柏林周围的边界框):
bounding_box = {
"min_lon": 12.927785560520098,
"max_lon": 13.94903874307725,
"min_lat": 52.284843473119714,
"max_lat": 52.77168805093944
}
我尝试按照以下方式执行此操作,但它返回一个空文件:
import geopandas as gpd
from shapely.geometry import box
data = gpd.read_file('C:/Users/Julian/Downloads/kontur_population.gpkg/kontur_population.gpkg')
bounding_box = {
"min_lon": 12.927785560520098,
"max_lon": 13.94903874307725,
"min_lat": 52.284843473119714,
"max_lat": 52.77168805093944
}
min_lon = bounding_box["min_lon"]
max_lon = bounding_box["max_lon"]
min_lat = bounding_box["min_lat"]
max_lat = bounding_box["max_lat"]
bbox = box(min_lon, min_lat, max_lon, max_lat)
clipped_data = data[data.geometry.within(bbox)]
clipped_data.to_file('C:/Users/Julian/Downloads/kontur_population.gpkg/kontur_population.gpkg', driver="GPKG")
如果您在此事上提供任何帮助,我将不胜感激!
您的脚本有 2 个错误:
.loc[...]
另外,我这样加快了脚本的速度:
read_file
调用中添加了一个 bbox 过滤器。这将使用 .gpkg 文件中存在的空间索引粗略地过滤数据,从而大大加快读取速度。engine="pyogrio"
用于阅读和写作。一般来说,“pyogrio”引擎比默认引擎快很多。在 geopandas 的下一个版本中,这将成为默认值。更正的脚本:
from pathlib import Path
import geopandas as gpd
from pyproj import Transformer
from shapely.geometry import box, Point
from shapely.ops import transform
bounding_box = {
"min_lon": 12.927785560520098,
"max_lon": 13.94903874307725,
"min_lat": 52.284843473119714,
"max_lat": 52.77168805093944,
}
min_lon = bounding_box["min_lon"]
max_lon = bounding_box["max_lon"]
min_lat = bounding_box["min_lat"]
max_lat = bounding_box["max_lat"]
# Convert bbox to crs of data: epsg:3857
transformer = Transformer.from_crs(4326, 3857)
min_3857 = transform(transformer.transform, Point(min_lat, min_lon))
max_3857 = transform(transformer.transform, Point(max_lat, max_lon))
# dir = Path("C:/Temp/test_population")
dir = Path("C:/Users/Julian/Downloads/kontur_population.gpkg")
data = gpd.read_file(
dir / "kontur_population_20231101.gpkg",
bbox=(min_3857.x, min_3857.y, max_3857.x, max_3857.y),
engine="pyogrio",
)
bbox = box(min_3857.x, min_3857.y, max_3857.x, max_3857.y)
clipped_data = data.loc[data.geometry.within(bbox)]
clipped_data.to_file(
dir / "kontur_population_within.gpkg",
driver="GPKG",
engine="pyogrio",
)