我有一个很大的 .tif 文件(~24 Gb,超过 RAM 可以存储的数量)
(类似问题:https://rasterio.readthedocs.io/en/latest/topics/masking-by-shapefile.html)
(来自图书馆的方法:https://rasterio.readthedocs.io/en/latest/topics/masking-by-shapefile.html)
但我正在寻找另一种更快的解决方案。也许我应该用零创建 numpy 数组并将多边形区域添加到该数组中(它不应该比使用 rasterio 更好)
这是我用经度和纬度处理多边形的代码。
import rasterio
import geopandas as gpd
from tqdm import tqdm
import os
def generate_mask(raster_path, shape_path, output_path):
gdf = gpd.read_file(shape_path)
with rasterio.open(raster_path) as src:
for num in tqdm(sorted(gdf['class'].unique())):
if os.path.isfile(f'{path}/masks/{num}.tif'):
continue
shapes = gdf.loc[(gdf['class'] == num)]['geometry']
file_name = f'{num}.tif'
for shp in shapes:
bound = rasterio.features.bounds(shp)
bbox = rasterio.windows.from_bounds(*bound, src.transform)
window_transform = rasterio.windows.transform(window=bbox, transform=src.transform)
img_crop = src.read([1,2,3], window=bbox)
print(f'Image read')
mask = rasterio.features.geometry_mask([shp],out_shape=(img_crop.shape[1], img_crop.shape[2]),
transform=window_transform, invert=True)
out_image, out_transform = rasterio.mask.mask(src, shp, crop=True, filled=True)
out_meta = src.meta
print(f'Mask {fn} is done')
with rasterio.open(f'{output_path}{file_name}', "w") as dest:
dest.write(out_image)
print(f'Mask is written')`
这里的问题是,它一次只能处理一个多边形,此外,如果我一次处理所有形状,则制作蒙版的时间超过 3 小时。 (太长了,服务器通常会杀死进程。)
我使用 osgeo.gdal.Rasterize() 解决了问题
(参见文档:https://gdal.org/api/python/osgeo.gdal.html)
它比类似物更好地处理大文件(每个面具约 3-5 分钟)
这是我的代码:
from osgeo import gdal
import os
def gdal_generate_mask(raster_path, shape_path, output_path):
# open the raster layer and get its relevant properties
raster_ds = gdal.Open(raster_path, gdal.GA_ReadOnly)
xSize = raster_ds.RasterXSize
ySize = raster_ds.RasterYSize
geotransform = raster_ds.GetGeoTransform()
projection = raster_ds.GetProjection()
# create the target layer (1 band binary)
gdf = gpd.read_file(shape_path)
for num in tqdm(sorted(gdf['class'].unique())):
driver = gdal.GetDriverByName('GTiff')
target_ds = driver.Create(f'{output_path}{num}.tif', xSize, ySize, bands=1, eType = gdal.GDT_Byte, options = ["COMPRESS=DEFLATE"])
target_ds.SetGeoTransform(geotransform)
target_ds.SetProjection(projection)
# create temp file for vector layer gdal Rasterize
vector_layer = f'/home/username/GeoProj/tmp_shp/{num}.shp'
if not os.path.isfile(vector_layer):
gdf[gdf['class'] == num].to_file(vector_layer)
# rasterize the vector layer into the target one
ds = gdal.Rasterize(target_ds, vector_layer, burnValues=[1])
附言我应该在这里寻求答案:https://gis.stackexchange.com/