Seeking to handle 1D linear interpolation on a 3D gridded dataset

Question

This is a follow-up to an earlier question: Implementing 1D interpolation on a 3D Array in Numpy or Xarray

Tsoil is a 3D xarray DataArray with the following dimensions:

<xarray.DataArray 'Tsoil' (lat: 1200, lon: 7200, depth: 4)>
dask.array<xarray-<this-array>, shape=(1200, 7200, 4), dtype=float32, chunksize=(1200, 7200, 4), chunktype=numpy.ndarray>
Coordinates:
  * lat      (lat) float64 30.03 30.08 30.13 30.18 ... 89.83 89.88 89.93 89.98
  * lon      (lon) float64 -180.0 -179.9 -179.9 -179.8 ... 179.9 179.9 180.0
  * depth    (depth) float64 3.5 17.5 64.0 194.5

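This corresponds to soil temperature at different depths on a lat/lon grid, so the innermost dimension holds the Tsoil values at each depth. I want to interpolate Tsoil so that I get a finer soil temperature resolution, every 0.5 cm between 0 cm and 289 cm, instead of temperatures at only 4 depths. This requires a 1D interpolation along the innermost dimension of the xarray object.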

I defined a function whose input is a chunk from the xarray dataset:

import numpy as np
import pandas as pd
import xarray as xr
import scipy
import dask.array as da

def interp1d_chunk(chunk, new_depths=new_depths, depths=depths):
    print(chunk)    
    nlat, nlon, _ = chunk.shape
    
    new_chunk = np.empty((nlat, nlon, len(new_depths)))
    
    for i in range(nlat):
        for j in range(nlon):
            f = scipy.interpolate.interp1d(depths,chunk[i, j, :],bounds_error=False,fill_value="extrapolate")
            new_chunk[i, j, :] = f(new_depths)
            print(new_chunk)    
            return new_chunk

Suppose Tsoil (for the purposes here) is a 3D array of shape (4, 4, 4):

    test_array = np.asarray([[[ 9.984375  ,  9.315826  ,  6.753296  , -0.71829224],
    [ 9.812378  ,  9.15155   ,  6.6003723 , -0.7545166 ],
    [ 9.925293  ,  9.266205  ,  6.729767  , -0.67056274],
    [ 9.925293  ,  9.266205  ,  6.729767  , -0.67056274]],

   [[10.201508  ,  9.524597  ,  6.9415283 , -0.6557312 ],
    [ 9.812378  ,  9.15155   ,  6.6003723 , -0.7545166 ],
    [10.083374  ,  9.394531  ,  6.7566833 , -0.7775879 ],
    [ 9.925293  ,  9.266205  ,  6.729767  , -0.67056274]],

   [[10.201508  ,  9.524597  ,  6.9415283 , -0.6557312 ],
    [ 9.812378  ,  9.15155   ,  6.6003723 , -0.7545166 ],
    [10.083374  ,  9.394531  ,  6.7566833 , -0.7775879 ],
    [ 9.925293  ,  9.266205  ,  6.729767  , -0.67056274]],

   [[10.201508  ,  9.524597  ,  6.9415283 , -0.6557312 ],
    [10.109436  ,  9.4236145 ,  6.80542   , -0.7413635 ],
    [10.083374  ,  9.394531  ,  6.7566833 , -0.7775879 ],
    [10.083374  ,  9.394531  ,  6.7566833 , -0.7775879 ]]],dtype=np.float32)

    test_lats = [60.275, 60.325, 60.375, 60.425]
    test_lons = [140.75, 140.8, 140.85, 140.9]

    depths=np.asarray([3.5,17.5,64,194.5])
    new_depths = np.arange(0,289.1,0.5)

I will create a dummy xarray DataArray named test_stemp:

    test_stemp = xr.DataArray(test_array,
                              coords={'lat': test_lats, 'lon': test_lons, 'depth': depths},
                              dims=['lat', 'lon', 'depth']).rename('Tsoil').chunk(chunks={'lat': len(test_lats), 'lon': len(test_lons), 'depth': 4})

I then try to run it through interp1d_chunk and interpolate along new_depths, but the function appears to be passed a chunk of size zero:

stemp_interp = da.map_blocks(interp1d_chunk, chunk=test_stemp, new_depths=new_depths, depths=depths, dtype='float',chunks=(len(test_lats),len(test_lons),len(new_depths)))

<xarray.DataArray (lat: 0, lon: 0, depth: 0)>
array([], shape=(0, 0, 0), dtype=float32)
Coordinates:
  * lat      (lat) float64 
  * lon      (lon) float64 
  * depth    (depth) float64

Does anyone have suggestions about what might be wrong with my function or my procedure?

Answer 1

This is the expected behavior.

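Note that map_blocks will attempt to automatically determine the output array type by calling func on 0-d versions of the inputs. Please refer to the meta keyword argument below if you expect that the function will not succeed when operating on 0-d arrays.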

https://docs.dask.org/en/stable/generated/dask.array.map_blocks.html
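To see this probing for yourself, here is a minimal sketch that records the shape of every block the function receives. The function double_depth and the list seen_shapes are made up for illustration and are not part of the question. Without meta, I would expect a zero-sized shape to show up in the list before compute() is ever called, although the exact behavior may depend on your dask version.

```python
import numpy as np
import dask.array as da

seen_shapes = []  # shape of every block the function is called with

def double_depth(block):
    # Hypothetical stand-in for the interpolation: it only changes the last axis.
    seen_shapes.append(block.shape)
    nlat, nlon, ndepth = block.shape
    return np.zeros((nlat, nlon, 2 * ndepth))

x = da.ones((4, 4, 4), chunks=(4, 4, 4))

# No meta given, so dask has to infer the output type on its own.
y = da.map_blocks(double_depth, x, dtype="float64", chunks=(4, 4, 8))
shapes_before_compute = list(seen_shapes)

result = y.compute()

print("before compute:", shapes_before_compute)
print("after compute: ", seen_shapes)
print("result shape:  ", result.shape)
```

The real (4, 4, 4) block only reaches the function once compute() runs.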

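Because meta is not defined, map_blocks calls your function with a (0, 0, 0) array. You can either define meta or, as in this case, change the return value so that the function returns an array even when the first two axes have length zero.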

def interp1d_chunk(chunk, new_depths, depths):
    print(chunk)
    nlat, nlon, _ = chunk.shape
    
    new_chunk = np.zeros((nlat, nlon, len(new_depths)))
    
    for i in range(nlat):
        for j in range(nlon):
            f = scipy.interpolate.interp1d(depths,chunk[i, j, :],bounds_error=False,fill_value="extrapolate")
            new_chunk[i, j, :] = f(new_depths)
            print(new_chunk)    
    return new_chunk

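Note that I un-indented return new_chunk by two levels. I think this is also a good change for correctness: it makes no sense to return new_chunk after interpolating only one set of depth values.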
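To make the effect of the misplaced return concrete, here is a small sketch on a made-up (2, 2, 4) array. The names interp_early_return and interp_fixed and the profile values are mine, not from the question. With the return inside the inner loop, only the [0, 0] profile is ever interpolated.

```python
import numpy as np
from scipy.interpolate import interp1d

depths = np.asarray([3.5, 17.5, 64, 194.5])
new_depths = np.arange(0, 289.1, 0.5)

def interp_early_return(chunk):
    # Same loop as in the question: the return sits inside the inner loop.
    nlat, nlon, _ = chunk.shape
    new_chunk = np.zeros((nlat, nlon, len(new_depths)))
    for i in range(nlat):
        for j in range(nlon):
            f = interp1d(depths, chunk[i, j, :], bounds_error=False, fill_value="extrapolate")
            new_chunk[i, j, :] = f(new_depths)
            return new_chunk  # exits after the first (i, j) pair

def interp_fixed(chunk):
    # Same loop, but the return runs only after every profile is done.
    nlat, nlon, _ = chunk.shape
    new_chunk = np.zeros((nlat, nlon, len(new_depths)))
    for i in range(nlat):
        for j in range(nlon):
            f = interp1d(depths, chunk[i, j, :], bounds_error=False, fill_value="extrapolate")
            new_chunk[i, j, :] = f(new_depths)
    return new_chunk

# Hypothetical data: every grid cell has the profile 10, 9, 7, -1.
chunk = np.tile(np.asarray([10.0, 9.0, 7.0, -1.0]), (2, 2, 1))

early = interp_early_return(chunk)
fixed = interp_fixed(chunk)

# Count the grid cells whose profile was actually filled in.
print(np.count_nonzero(early.any(axis=-1)))  # 1
print(np.count_nonzero(fixed.any(axis=-1)))  # 4
```

There is a second consequence: with an empty (0, 0, 4) chunk the loop body never runs, so the early-return version falls off the end of the function and returns None instead of an array.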

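I also changed np.empty() to np.zeros(). While np.empty() can be faster, it also leaves you working with uninitialized memory. It is usually not worth the trouble of getting nondeterministic results.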
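As a quick illustration of the difference (the shape here is arbitrary): both calls allocate an array of the same shape and dtype, but only np.zeros() guarantees its contents. Any element the loop never writes to keeps whatever np.empty() happened to find in memory.

```python
import numpy as np

shape = (2, 2, 5)

zeros = np.zeros(shape)  # every element is guaranteed to be 0.0
empty = np.empty(shape)  # same shape and dtype, contents are arbitrary

print(zeros.shape == empty.shape, zeros.dtype == empty.dtype)
print(np.all(zeros == 0.0))
# Do not rely on the values in `empty`; they can differ from run to run.
```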

Finally, note that this computation is lazy. It will not actually perform the interpolation until you call stemp_interp.compute().
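For completeness, here is a sketch of the other option, passing meta explicitly so that map_blocks should not need to probe the function with an empty array. It assumes the input is a plain dask array passed positionally. The random tsoil values are made up, and meta=np.array((), dtype=...) follows the pattern shown in the dask documentation. Treat it as a starting point rather than a drop-in solution for the full dataset.

```python
import numpy as np
import dask.array as da
from scipy.interpolate import interp1d

depths = np.asarray([3.5, 17.5, 64, 194.5])
new_depths = np.arange(0, 289.1, 0.5)

def interp1d_chunk(chunk, new_depths, depths):
    # Interpolate every (lat, lon) profile onto new_depths.
    nlat, nlon, _ = chunk.shape
    new_chunk = np.zeros((nlat, nlon, len(new_depths)))
    for i in range(nlat):
        for j in range(nlon):
            f = interp1d(depths, chunk[i, j, :], bounds_error=False, fill_value="extrapolate")
            new_chunk[i, j, :] = f(new_depths)
    return new_chunk

# Hypothetical stand-in for Tsoil: random values in a single (4, 4, 4) chunk.
data = np.random.default_rng(0).normal(size=(4, 4, 4)).astype("float32")
tsoil = da.from_array(data, chunks=(4, 4, 4))

stemp_interp = da.map_blocks(
    interp1d_chunk,
    tsoil,                                # the dask array is passed positionally
    new_depths=new_depths,
    depths=depths,
    dtype="float64",
    chunks=(4, 4, len(new_depths)),       # output chunk shape differs on the last axis
    meta=np.array((), dtype="float64"),   # tells dask the output is a NumPy float64 array
)

print(stemp_interp)               # still lazy: nothing has been interpolated yet
result = stemp_interp.compute()   # the interpolation runs here
print(result.shape)
```

Until compute() is called, stemp_interp is only a description of the work to be done, which is why printing it shows a dask array rather than values.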
