为 kontur 人口数据选择选定边界框内的多边形

问题描述 投票:0回答:1

我有来自 https://data.humdata.org/dataset/kontur-population-dataset? 的数据,我想选择边界框内的多边形。例如,我定义边界框如下(柏林周围的边界框):

bounding_box = {
    "min_lon": 12.927785560520098,
    "max_lon": 13.94903874307725,
    "min_lat": 52.284843473119714,
    "max_lat": 52.77168805093944
} 

我尝试按照以下方式执行此操作,但它返回一个空文件:

import geopandas as gpd
from shapely.geometry import box
data = gpd.read_file('C:/Users/Julian/Downloads/kontur_population.gpkg/kontur_population.gpkg')
bounding_box = {
    "min_lon": 12.927785560520098,
    "max_lon": 13.94903874307725,
    "min_lat": 52.284843473119714,
    "max_lat": 52.77168805093944
}
min_lon = bounding_box["min_lon"]
max_lon = bounding_box["max_lon"]
min_lat = bounding_box["min_lat"]
max_lat = bounding_box["max_lat"]

bbox = box(min_lon, min_lat, max_lon, max_lat)
clipped_data = data[data.geometry.within(bbox)]
clipped_data.to_file('C:/Users/Julian/Downloads/kontur_population.gpkg/kontur_population.gpkg', driver="GPKG")

如果您在此事上提供任何帮助,我将不胜感激!

python polygon geopandas shapely bounding-box
1个回答
0
投票

您的脚本有 2 个错误:

  • 数据位于 crs 3857(WGS 84 / Pseudo-Mercator)中,而不是 4326(标准 WGS 84)中,并且您的边界框应与数据位于同一投影中
  • 您的 pandas 数据框过滤器丢失了
    .loc[...]

另外,我这样加快了脚本的速度:

  • 我已经在
    read_file
    调用中添加了一个 bbox 过滤器。这将使用 .gpkg 文件中存在的空间索引粗略地过滤数据,从而大大加快读取速度。
  • 我添加了
    engine="pyogrio"
    用于阅读和写作。一般来说,“pyogrio”引擎比默认引擎快很多。在 geopandas 的下一个版本中,这将成为默认值。

更正的脚本:

from pathlib import Path
import geopandas as gpd
from pyproj import Transformer
from shapely.geometry import box, Point
from shapely.ops import transform

bounding_box = {
    "min_lon": 12.927785560520098,
    "max_lon": 13.94903874307725,
    "min_lat": 52.284843473119714,
    "max_lat": 52.77168805093944,
}

min_lon = bounding_box["min_lon"]
max_lon = bounding_box["max_lon"]
min_lat = bounding_box["min_lat"]
max_lat = bounding_box["max_lat"]

# Convert bbox to crs of data: epsg:3857
transformer = Transformer.from_crs(4326, 3857)
min_3857 = transform(transformer.transform, Point(min_lat, min_lon))
max_3857 = transform(transformer.transform, Point(max_lat, max_lon))

# dir = Path("C:/Temp/test_population")
dir = Path("C:/Users/Julian/Downloads/kontur_population.gpkg")
data = gpd.read_file(
    dir / "kontur_population_20231101.gpkg",
    bbox=(min_3857.x, min_3857.y, max_3857.x, max_3857.y),
    engine="pyogrio",
)

bbox = box(min_3857.x, min_3857.y, max_3857.x, max_3857.y)
clipped_data = data.loc[data.geometry.within(bbox)]
clipped_data.to_file(
    dir / "kontur_population_within.gpkg",
    driver="GPKG",
    engine="pyogrio",
)
© www.soinside.com 2019 - 2024. All rights reserved.