Finding non-overlapping polygons in a GeoDataFrame

Question   Votes: 0   Answers: 2

I have a GeoDataFrame with a column of shapely Polygon objects. Some of them are distinct and some are not:

In [1]: gdf
Out[2]:
    geometry
1   POLYGON ((1 1, 1 2, 2 2, 2 1, 1 1))
2   POLYGON ((1 3, 1 4, 2 4, 2 3, 1 3))
3   POLYGON ((1 1, 1 2, 2 2, 2 1, 1 1))
4   POLYGON ((3 1, 3 2, 4 2, 4 1, 3 1))
5   POLYGON ((1 3, 1 4, 2 4, 2 3, 1 3))

I need to keep only the distinct (non-overlapping) polygons:

In [1]: gdf_distinct
Out[2]:
    geometry
1   POLYGON ((1 1, 1 2, 2 2, 2 1, 1 1))
2   POLYGON ((1 3, 1 4, 2 4, 2 3, 1 3))
4   POLYGON ((3 1, 3 2, 4 2, 4 1, 3 1))

Because polygons are not hashable, I can't use the simple pandas approach:

In [1]: gdf_distinct = gdf['geometry'].unique()

TypeError: unhashable type: 'Polygon'

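Is there a simple and efficient way to get a new GeoDataFrame that contains only the distinct polygons?

P.S.

I found one way, but it only works for polygons that are exact duplicates, and I don't think it is very efficient: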

In [1]: m = []
        for index, row in gdf.iterrows():
           if row['geometry'] not in m:
              m.append(row['geometry'])
        gdf_distinct = GeoDataFrame(geometry=m)
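
For the exact-duplicate case, a set keyed on each polygon's WKB bytes avoids rescanning a list with "not in" on every row. The following is a minimal sketch rather than a tested recipe for every dataset: drop_exact_duplicates is a hypothetical helper name, and it assumes duplicates have identical coordinates in identical order, since a ring that traces the same shape from a different starting vertex produces different WKB.

```python
from shapely.geometry import Polygon


def drop_exact_duplicates(polygons):
    """Return polygons with exact repeats removed, keeping first occurrences.

    Hypothetical helper: equality is judged on WKB bytes, so only polygons
    with identical coordinates in identical order count as duplicates.
    """
    seen = set()
    distinct = []
    for poly in polygons:
        key = poly.wkb  # bytes are hashable, so they can live in a set
        if key not in seen:
            seen.add(key)
            distinct.append(poly)
    return distinct


# The five polygons from the question
polygons = [
    Polygon([(1, 1), (1, 2), (2, 2), (2, 1), (1, 1)]),
    Polygon([(1, 3), (1, 4), (2, 4), (2, 3), (1, 3)]),
    Polygon([(1, 1), (1, 2), (2, 2), (2, 1), (1, 1)]),
    Polygon([(3, 1), (3, 2), (4, 2), (4, 1), (3, 1)]),
    Polygon([(1, 3), (1, 4), (2, 4), (2, 3), (1, 3)]),
]
distinct = drop_exact_duplicates(polygons)
print([p.wkt for p in distinct])
```

The resulting list can be passed to GeoDataFrame(geometry=distinct) in the same way as the list m above.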
python shapely geopandas
2 Answers
5 votes

Let's start with a list of 4 polygons, three of which overlap with other polygons:

from shapely.geometry import Polygon
import geopandas

polygons = [
    Polygon([[1, 1], [1, 3], [3, 3], [3, 1], [1, 1]]),
    Polygon([[1, 3], [1, 5], [3, 5], [3, 3], [1, 3]]),
    Polygon([[2, 2], [2, 3.5], [3.5, 3.5], [3.5, 2], [2, 2]]),
    Polygon([[3, 1], [3, 2], [4, 2], [4, 1], [3, 1]]),
]
gdf = geopandas.GeoDataFrame(data={'A': list('ABCD')}, geometry=polygons)
gdf.plot(column='A', alpha=0.75)

They look like this:

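So we can loop over each polygon, then loop over all the others, and use the shapely API to check whether any of them overlap. If none do, we append the polygon to our output list: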

non_overlapping = []
for p in polygons:
    overlaps = []
    for g in filter(lambda g: not g.equals(p), polygons):
        overlaps.append(g.overlaps(p))

    if not any(overlaps):
        non_overlapping.append(p)

Which gives me (shown as WKT):

['POLYGON ((3 1, 3 2, 4 2, 4 1, 3 1))']

That is exactly what I expected.
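
It helps to spell out what overlaps() counts, because it is stricter than "shares any point". The sketch below uses made-up squares (a, b, c and a_copy are illustrative names, not part of the data above): squares that share only an edge touch but do not overlap, and an exact copy is equal to the original but does not overlap it either.

```python
from shapely.geometry import Polygon

a = Polygon([(0, 0), (0, 2), (2, 2), (2, 0)])
b = Polygon([(2, 0), (2, 2), (4, 2), (4, 0)])  # shares only the edge x == 2 with a
c = Polygon([(1, 1), (1, 3), (3, 3), (3, 1)])  # covers part of a's interior
a_copy = Polygon([(0, 0), (0, 2), (2, 2), (2, 0)])

print(a.overlaps(c))         # True: interiors intersect, neither contains the other
print(a.overlaps(b))         # False: only the boundaries meet
print(a.touches(b))          # True
print(a.overlaps(a_copy))    # False: equal geometries do not count as overlapping
print(a.equals(a_copy))      # True
print(a.intersects(a_copy))  # True: intersects() does catch exact duplicates
```

So an overlaps()-based loop treats exact duplicates, such as rows 1 and 3 in the question, as non-overlapping; catching those needs equals() or intersects() instead.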

But this is effectively O(N^2), and I don't think it has to be.

So let's try not to check the same pair twice:

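overlapping = set()
for n, p in enumerate(polygons[:-1], 1):  # the last element has no later partner
    for m, g in enumerate(polygons[n:], n):  # loop from the next element to the end
        if g.overlaps(p):
            # overlaps() is symmetric, so one hit flags both polygons;
            # recording only p would miss polygons that overlap earlier ones
            overlapping.update((n - 1, m))

non_overlapping = [str(p) for i, p in enumerate(polygons) if i not in overlapping]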

I get the same result, and it is a bit faster on my machine.

We can compress the loop a little by using a generator inside the if statement instead of a plain for block:

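non_overlapping = []
for n, p in enumerate(polygons):
    # Compare p with every other polygon; slicing to polygons[n + 1:] here
    # would miss overlaps with earlier polygons
    if not any(p.overlaps(g) for m, g in enumerate(polygons) if m != n):
        non_overlapping.append(p)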

Same result.
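
For larger inputs, a spatial index can cut down the number of pairs that need an exact test. The following is a sketch that assumes Shapely 2.x, where STRtree.query accepts a predicate argument and returns integer positions; in Shapely 1.x the method behaves differently, so treat the call as an assumption about the installed version.

```python
from shapely.geometry import Polygon
from shapely.strtree import STRtree

polygons = [
    Polygon([[1, 1], [1, 3], [3, 3], [3, 1], [1, 1]]),
    Polygon([[1, 3], [1, 5], [3, 5], [3, 3], [1, 3]]),
    Polygon([[2, 2], [2, 3.5], [3.5, 3.5], [3.5, 2], [2, 2]]),
    Polygon([[3, 1], [3, 2], [4, 2], [4, 1], [3, 1]]),
]

tree = STRtree(polygons)

# With an array-like input, query() returns a 2 x k array: row 0 holds the
# positions of the query geometries, row 1 the positions of the matching
# tree geometries (Shapely 2.x behaviour, assumed here).
query_idx, tree_idx = tree.query(polygons, predicate="overlaps")

# Any position that appears in a matching pair overlaps something
overlapping = set(query_idx.tolist()) | set(tree_idx.tolist())
non_overlapping = [p for i, p in enumerate(polygons) if i not in overlapping]
print([p.wkt for p in non_overlapping])
```

The index first narrows candidates by bounding box and only then applies the exact overlaps test, so the benefit grows when most polygons are far apart.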


0 votes

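Thanks to @Paul H for the great answer and to @alphabetasoup for the thoughtful comment.

My solution does not answer the question in a different way, but it is related. My use case involved finding only the overlapping polygons. For that I made a small modification to the code, and found that I needed to include the last element so that I would not lose one of the overlapping polygons.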

# Find polygons in a geopandas dataframe that overlap with another polygon
# in the same dataframe, as well as the non-overlapping polygons
geoms = list(gdf.geometry)
overlapping = []
non_overlapping = []
for n, p in enumerate(geoms):  # every element is visited, the last one included
    # Compare against every other polygon, earlier rows included; checking
    # only the later rows would misclassify a polygon whose sole overlaps
    # are with earlier rows
    if any(p.overlaps(g) for m, g in enumerate(geoms) if m != n):
        overlapping.append(p)
    else:
        non_overlapping.append(p)  # stored as a geometry, not as a string

My use case also required keeping the other columns of the original geopandas GeoDataFrame. This is how I did it:

geoms = list(gdf.geometry)
overlapping = []
non_overlapping = []
for n, p in enumerate(geoms):  # zero-based positions, suitable for .iloc
    # Compare against every other polygon, earlier rows included
    if any(p.overlaps(g) for m, g in enumerate(geoms) if m != n):
        # Store the position in the original dataframe
        overlapping.append(n)
    else:
        non_overlapping.append(n)

# Create new dataframes and reset their indexes
gdf_overlapping = gdf.iloc[overlapping].reset_index(drop=True)
gdf_non_overlapping = gdf.iloc[non_overlapping].reset_index(drop=True)
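
The same selection can also be written as a boolean mask, which keeps every column and the original index without collecting positions by hand. This is a minimal sketch, assuming the four example polygons from the first answer; has_overlap is an illustrative name, and each polygon is compared against all of the others.

```python
import geopandas
import numpy as np
from shapely.geometry import Polygon

polygons = [
    Polygon([[1, 1], [1, 3], [3, 3], [3, 1], [1, 1]]),
    Polygon([[1, 3], [1, 5], [3, 5], [3, 3], [1, 3]]),
    Polygon([[2, 2], [2, 3.5], [3.5, 3.5], [3.5, 2], [2, 2]]),
    Polygon([[3, 1], [3, 2], [4, 2], [4, 1], [3, 1]]),
]
gdf = geopandas.GeoDataFrame(data={'A': list('ABCD')}, geometry=polygons)

geoms = list(gdf.geometry)
# True where a polygon overlaps at least one other polygon in the frame
has_overlap = np.array([
    any(p.overlaps(g) for m, g in enumerate(geoms) if m != n)
    for n, p in enumerate(geoms)
])

gdf_overlapping = gdf[has_overlap]       # keeps column 'A' and the original index
gdf_non_overlapping = gdf[~has_overlap]
print(gdf_non_overlapping['A'].tolist())
```

Call reset_index(drop=True) on either result if a fresh index is wanted.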