Merging xarray datasets with the same extent but different spatial resolutions

Question (2 votes, 1 answer)

I have a satellite-based solar-induced fluorescence (SIF) dataset and a modeled precipitation dataset. I want to compare precipitation to SIF for each pixel in my study area. The two datasets cover the same area but at slightly different spatial resolutions. When I take the mean over the whole area I can successfully plot the values over time and compare them against each other, but I'm struggling to create a per-pixel scatter plot.

Honestly, given how uncertain the effect of precipitation on SIF is, I'm not sure this is even the best way to compare the two quantities, so I'm open to ideas for different approaches. As for actually merging the data, I'm currently using xr.combine_by_coords, but that gives me the error described below. I could also do this by converting the netCDFs to GeoTIFFs and then warping one onto the other with rasterio, but that seems like an inefficient way to make this comparison. Here's what I have so far:


```python
import netCDF4
import numpy as np
import dask
import xarray as xr

rainy_bbox = np.array([
    [-69.29519955115512, -13.861261028444734],
    [-69.29519955115512, -12.384786628185896],
    [-71.19583431678012, -12.384786628185896],
    [-71.19583431678012, -13.861261028444734]])
max_lon_lat = np.max(rainy_bbox, axis=0)
min_lon_lat = np.min(rainy_bbox, axis=0)

# this dataset is available here: ftp://fluo.gps.caltech.edu/data/tropomi/gridded/
sif = xr.open_dataset('../data/TROPO_SIF_03-2018.nc')

# the dataset is global so subset to my study area in the Amazon
rainy_sif_xds = sif.sel(lon=slice(min_lon_lat[0], max_lon_lat[0]),
                        lat=slice(min_lon_lat[1], max_lon_lat[1]))

# this data can all be downloaded from NASA Goddard either manually or with wget,
# but you'll need an account on https://disc.gsfc.nasa.gov/: https://pastebin.com/viZckVdn
imerg_xds = xr.open_mfdataset('../data/3B-DAY.MS.MRG.3IMERG.201803*.nc4')

# spatial subset
rainy_imerg_xds = imerg_xds.sel(lon=slice(min_lon_lat[0], max_lon_lat[0]),
                                lat=slice(min_lon_lat[1], max_lon_lat[1]))

# I'm not sure the best way to combine these datasets but am trying this
combo_xds = xr.combine_by_coords([rainy_imerg_xds, rainy_sif_xds])
```

Currently I'm getting a seemingly unhelpful RecursionError: maximum recursion depth exceeded in comparison on the last line. When I add the argument join='left', the data from the rainy_imerg_xds dataset is present in combo_xds; when I use join='right', the rainy_sif_xds data is present; and with join='inner' no data is present at all. I assumed this function did some internal interpolation, but apparently it doesn't.
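The join behavior can be illustrated with a minimal sketch using made-up grids (toy data, not the real SIF/IMERG files): combine_by_coords only aligns on coordinate labels that match exactly, so two grids at different resolutions share almost no labels, join='inner' drops nearly everything, and the other joins just pad with NaNs instead of interpolating.

```python
import numpy as np
import xarray as xr

# toy stand-ins for the two products: same extent, different resolutions
coarse = xr.Dataset(
    {"precip": (("lat", "lon"), np.random.rand(4, 4))},
    coords={"lat": np.linspace(-14.0, -12.5, 4), "lon": np.linspace(-71.0, -69.5, 4)},
)
fine = xr.Dataset(
    {"sif": (("lat", "lon"), np.random.rand(6, 6))},
    coords={"lat": np.linspace(-14.0, -12.5, 6), "lon": np.linspace(-71.0, -69.5, 6)},
)

# combine_by_coords aligns on exact coordinate labels; it never interpolates
outer = xr.combine_by_coords([coarse, fine], join="outer")  # union grid, NaNs where a variable has no value
inner = xr.combine_by_coords([coarse, fine], join="inner")  # only labels present in both grids survive
print(dict(outer.sizes), dict(inner.sizes))
```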

python geospatial netcdf python-xarray
1 Answer

0 votes

The interpolation documentation from xarray (http://xarray.pydata.org/en/stable/interpolation.html) outlines the solution to this problem pretty simply. xarray lets you interpolate across multiple dimensions and specify another dataset's x and y dimensions as the output dimensions, so in this case it can be done with:

```python
# interpolation based on http://xarray.pydata.org/en/stable/interpolation.html
# interpolation can't be done across the chunked dimension so we have to load it all into memory
rainy_sif_xds.load()

# interpolate into the higher resolution grid from IMERG
interp_rainy_sif_xds = rainy_sif_xds.interp(lat=rainy_imerg_xds["lat"], lon=rainy_imerg_xds["lon"])

# visualize the output
rainy_sif_xds.dcSIF.mean(dim='time').hvplot.quadmesh(
    'lon', 'lat', cmap='jet', geo=True, rasterize=True, dynamic=False, width=450).relabel('Initial') + \
interp_rainy_sif_xds.dcSIF.mean(dim='time').hvplot.quadmesh(
    'lon', 'lat', cmap='jet', geo=True, rasterize=True, dynamic=False, width=450).relabel('Interpolated')
```
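One thing worth adding (not part of the original answer): interp takes the output coordinates verbatim from the IMERG dataset, so the exact-label-matching problem from the question is gone, and a quick check along these lines confirms it. (xarray also offers Dataset.interp_like for this kind of regridding, but it would try to match every shared dimension coordinate, including the mismatched time axes here, so the explicit lat/lon call above is simpler.)

```python
import numpy as np

# sanity check: the interpolated SIF grid now carries IMERG's exact lat/lon labels,
# so label-based alignment will no longer drop everything
np.testing.assert_array_equal(interp_rainy_sif_xds['lat'].values, rainy_imerg_xds['lat'].values)
np.testing.assert_array_equal(interp_rainy_sif_xds['lon'].values, rainy_imerg_xds['lon'].values)
```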

(figure: side-by-side quadmesh maps of mean dcSIF on the original grid, labeled 'Initial', and on the IMERG grid, labeled 'Interpolated')

```python
# now that our coordinates match, in order to actually merge we need to convert the default
# CFTimeIndex to a DatetimeIndex, because the IMERG rainfall dataset was CFTime and the SIF
# dataset was datetime
rainy_imerg_xds['time'] = rainy_imerg_xds.indexes['time'].to_datetimeindex()

# now the merge can easily be done with
merged_xds = xr.combine_by_coords([rainy_imerg_xds, interp_rainy_sif_xds],
                                  coords=['lat', 'lon', 'time'], join="inner")

# now visualize the two datasets together; multiply SIF by 30 because the values are so low
merged_xds.HQprecipitation.rolling(time=7, center=True).sum().mean(dim=('lat', 'lon')).hvplot().relabel('Precip') * \
(merged_xds.dcSIF.mean(dim=('lat', 'lon'))*30).hvplot().relabel('SIF')
```
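Finally, to get back to the per-pixel comparison the question originally asked about, the merged dataset can be flattened into a table and scattered directly. A minimal sketch, assuming matplotlib is installed (the post itself uses hvplot; the column names come from the variables above):

```python
import matplotlib.pyplot as plt

# one row per (time, lat, lon) pixel; drop cells where either variable is missing
df = merged_xds[['HQprecipitation', 'dcSIF']].to_dataframe().dropna()

plt.scatter(df['HQprecipitation'], df['dcSIF'], s=2, alpha=0.3)
plt.xlabel('IMERG HQprecipitation')
plt.ylabel('TROPOMI dcSIF')
plt.title('Per-pixel precipitation vs. SIF')
plt.show()
```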
