重新取样xarray反对较低分辨率的空间

问题描述 投票:1回答:3

使用xarray重新取样,以较低的空间分辨率

我想我的重采样对象xarray到较低的空间分辨率(更少的像素)。

import pandas as pd
import numpy as np
import xarray as xr

time = pd.date_range(np.datetime64('1998-01-02T00:00:00.000000000'), np.datetime64('2005-12-28T00:00:00.000000000'), freq='8D')
x = np.arange(1200)
y = np.arange(1200)

latitude = np.linspace(40,50,1200)
longitude = np.linspace(0,15.5572382,1200)

latitude, longitude = np.meshgrid(latitude, longitude)

BHR_SW = np.ones((365, 1200, 1200))

output_da = xr.DataArray(BHR_SW, coords=[time, y, x])
latitude_da = xr.DataArray(latitude, coords=[y, x])
longitude_da = xr.DataArray(longitude, coords=[y, x])

output_da = output_da.rename({'dim_0':'time','dim_1':'y','dim_2':'x'})
latitude_da = latitude_da.rename({'dim_0':'y','dim_1':'x'})
longitude_da = longitude_da.rename({'dim_0':'y','dim_1':'x'})

output_ds = output_da.to_dataset(name='BHR_SW')
output_ds = output_ds.assign({'latitude':latitude_da, 'longitude':longitude_da})

print(output_ds)

<xarray.Dataset>
Dimensions:    (time: 365, x: 1200, y: 1200)
Coordinates:
  * time       (time) datetime64[ns] 1998-01-02 1998-01-10 ... 2005-12-23
  * y          (y) int64 0 1 2 3 4 5 6 7 ... 1193 1194 1195 1196 1197 1198 1199
  * x          (x) int64 0 1 2 3 4 5 6 7 ... 1193 1194 1195 1196 1197 1198 1199
Data variables:
    BHR_SW     (time, y, x) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
    latitude   (y, x) float64 40.0 40.01 40.02 40.03 ... 49.97 49.98 49.99 50.0
    longitude  (y, x) float64 0.0 0.0 0.0 0.0 0.0 ... 15.56 15.56 15.56 15.56
```

My question is, how to I resample the following by the x,y coordinates to a 200x200 grid?

这是一个减小可变的空间分辨率。

我曾尝试如下:

output_ds.resample(x=200).mean()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-54-10fbdf855a5d> in <module>()
----> 1 output_ds.resample(x=200).mean()

/home/mpim/m300690/miniconda3/envs/holaps/lib/python2.7/site-packages/xarray/core/common.pyc in resample(self, indexer, skipna, closed, label, base, keep_attrs, **indexer_kwargs)
    701         group = DataArray(dim_coord, coords=dim_coord.coords,
    702                           dims=dim_coord.dims, name=RESAMPLE_DIM)
--> 703         grouper = pd.Grouper(freq=freq, closed=closed, label=label, base=base)
    704         resampler = self._resample_cls(self, group=group, dim=dim_name,
    705                                        grouper=grouper,

/home/mpim/m300690/miniconda3/envs/holaps/lib/python2.7/site-packages/pandas/core/resample.pyc in __init__(self, freq, closed, label, how, axis, fill_method, limit, loffset, kind, convention, base, **kwargs)
   1198                              .format(convention))
   1199
-> 1200         freq = to_offset(freq)
   1201
   1202         end_types = set(['M', 'A', 'Q', 'BM', 'BA', 'BQ', 'W'])

/home/mpim/m300690/miniconda3/envs/holaps/lib/python2.7/site-packages/pandas/tseries/frequencies.pyc in to_offset(freq)
    174                     delta = delta + offset
    175         except Exception:
--> 176             raise ValueError(libfreqs._INVALID_FREQ_ERROR.format(freq))
    177
    178     if delta is None:

ValueError: Invalid frequency: 200

但我得到所示的错误。

我怎样才能完成x和y这个空间重采样?

理想我想这样做:

output_ds.resample(x=200, y=200).mean()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-55-e0bfce19e037> in <module>()
----> 1 output_ds.resample(x=200, y=200).mean()

/home/mpim/m300690/miniconda3/envs/holaps/lib/python2.7/site-packages/xarray/core/common.pyc in resample(self, indexer, skipna, closed, label, base, keep_attrs, **indexer_kwargs)
    679         if len(indexer) != 1:
    680             raise ValueError(
--> 681                 "Resampling only supported along single dimensions."
    682             )
    683         dim, freq = indexer.popitem()

ValueError: Resampling only supported along single dimensions.

注:实际数据有不同的行为

这对测试数据上面我已经创建。从NetCDF文件中读取的真实数据

<xarray.Dataset>
Dimensions:    (time: 368, x: 1200, y: 1200)
Coordinates:
  * time       (time) datetime64[ns] 1998-01-02 1998-01-10 ... 2005-12-28
Dimensions without coordinates: x, y
Data variables:
    latitude   (y, x) float32 ...
    longitude  (y, x) float32 ...
    Data_Mask  (y, x) float32 ...
    BHR_SW     (time, y, x) float32 ...
Attributes:
    CDI:               Climate Data Interface version 1.9.5 (http://mpimet.mp...
    Conventions:       CF-1.4
    history:           Fri Dec 07 13:29:13 2018: cdo mergetime GLOBALBEDO/Glo...
    content:           extracted variabel BHR_SW of the original GlobAlbedo (...
    metadata_profile:  beam
    metadata_version:  0.5
    CDO:               Climate Data Operators version 1.9.5 (http://mpimet.mp...
```

我已经尝试了类似的事情:

ds.resample(x=200).mean()

/home/mpim/m300690/miniconda3/envs/holaps/lib/python2.7/site-packages/xarray/core/common.pyc in resample(self, indexer, skipna, closed, label, base, keep_attrs, **indexer_kwargs)
    686         dim_coord = self[dim]
    687
--> 688         if isinstance(self.indexes[dim_name], CFTimeIndex):
    689             raise NotImplementedError(
    690                 'Resample is currently not supported along a dimension '

/home/mpim/m300690/miniconda3/envs/holaps/lib/python2.7/site-packages/xarray/core/coordinates.pyc in __getitem__(self, key)
    309         if key not in self._sizes:
    310             raise KeyError(key)
--> 311         return self._variables[key].to_index()
    312
    313     def __unicode__(self):

KeyError: 'x'

任何帮助非常感谢。

python netcdf resampling python-xarray
3个回答
4
投票

作为piman314表明,GROUPBY是xarray做到这一点的唯一方法。重新取样只能用于日期时间坐标。

由于目前xarray不处理多维GROUPBY,这有两个阶段来完成:

# this results in bin centers on 100, 300, ...
reduced = (
    output_ds
    .groupby(((output_ds.x//200) + 0.5) * 200)
    .mean(dim='x')
    .groupby(((output_ds.y//200) + 0.5) * 200)
    .mean(dim='y'))

如果你只是想下采样数据,您可以使用位置切片:

output_ds[:, ::200, ::200]

或者,使用命名DIMS:

output_ds[{'x': slice(None, None, 200), 'y': slice(None, None, 200)}]

最后,还有一些其他的包在那里,是专门为xarray兼容快速重新网格化设计。 xESMF是一个很好的一个。


2
投票

要使用xarray最明显的方式做到这一点是使用groupby_bins,但事实证明这是非常缓慢的。这可能更effecient落入numpy和使用超快的索引([:, :, frequency]

nsamples = 200
bins = np.linspace(output_ds.x.min(),
                   output_ds.x.max(), nsamples).astype(int)
output_ds = output_ds.groupby_bins('x', bins).first()

0
投票

正如你所使用的已用的CDO操纵的NetCDF文件,你也可以使用任意的CDO SAMPLEGRID功能或士官bilinear_interp功能:

SAMPLEGRIDhttps://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf)不进行内插,它只是删除了每个第n个网格点。

bilinear_interphttp://nco.sourceforge.net/nco.html#Bilinear-interpolation)做插值。

正如你可能想要平均值,最大值,无论反照率值,你可能会喜欢军士bilinear_interp。但是,债务抵押债券SAMPLEGRID可以给你你需要的国家奥委会grid_outbilinear_interp

© www.soinside.com 2019 - 2024. All rights reserved.