作为插值数据(点和值),我有一个 2-D 1x1 度纬度、经度网格,每个网格点上都有一个标量变量。网格的大面积由 NaN (~40%) 组成。作为坐标 (xi),我有一个“非结构化”坐标数组,所有这些坐标都在网格边界内,并且坐标很少(<0.1%) are located near NaNs.
通常,在线性插值时,当我将网格数据插值到非结构化坐标时,我会使用
scipy.interpolate.RegularGridInterpolator(points , values, method='linear')
。随后对剩余 0.1% 进行最近邻插值,其中线性插值在 NaN 附近失败。
fc = RegularGridInterpolator(points, values, method='cubic', bounds_error=True)
interped_cubic = fc(xi)
这提高了
"ValueError: Array must not contain infs or nans."
我也尝试过
interp2d
:
fc = interp2d(lon0, lat0, data, kind='cubic', bounds_error=True)
interped_cub = fc(lon, lat)
它返回形状网格(lon.size,lat.size)上的插值,而不是插值到坐标(lon,lat)上。 (有关 lon/lat 和 lon0/lat0 定义的可重现示例)
我还尝试在插值之前删除 NaN:
from scipy.interpolate import interpn
import numpy.ma as ma
mask = np.isnan(data_nan)
lonG,latG = np.meshgrid(lon0,lat0)
lonM = ma.MaskedArray(lonG, mask).compressed()
latM = ma.MaskedArray(latG, mask).compressed()
interped_cub = interpn( (latM, lonM), data_nan[~mask], (lat, lon), method='cubic' )
将网格简化为没有 NaN 的一维数组。插值返回
ValueError: There are 2 point arrays, but values has 1 dimensions
。
我期望
f=RegularGridInterpolator(points, values, method='cubic');f(xi)
在非 NaN 区域中的坐标上返回插值,并在 NaN 区域附近或上的坐标上返回 NaN 值 - 与使用 method='linear'
时函数返回的值相同,只是插值不同-方法。
由于我显然没有正确使用该函数,有人可以解释我的错误在哪里,并希望建议我可以尝试什么,以完成坐标上的三次插值吗?
import numpy as np, matplotlib.pyplot as plt
from scipy.interpolate import RegularGridInterpolator, interp2d
### THE FOLLOWING CODEBLOCK RUNS WITHOUT ERROR:
lat0, lon0 = np.linspace(-60, -40, 20), np.linspace(-50, 0.0, 50)
lonG, latG = np.meshgrid(lon0, lat0)
data = np.sin(latG*lonG) # Data has no NaN- areas
lat = np.random.uniform(low=-60, high=-40, size=2000)
lon = np.random.uniform(low=-50, high=0.0, size=2000)
plt.title('2D data, 1x1 grid');plt.pcolormesh(lon0, lat0, data);plt.show()
fl = RegularGridInterpolator((lat0, lon0), data, method='linear', bounds_error=True)
interped_lin = fl((lat, lon))
plt.title('linear interpolation');plt.scatter(lon, lat,c=interped_lin);plt.show()
fc = RegularGridInterpolator((lat0, lon0), data, method='cubic', bounds_error=True)
interped_cub = fc((lat, lon))
plt.title('cubic interpolation');plt.scatter(lon, lat,c=interped_cub);plt.show()
### THE FOLLOWING CODE RAISES ERROR
# Substitute some of the data with NaNs:
data_nan = data.copy()
data_nan[1:8, 5:12] = np.nan
data_nan[2:6, 12:20] = np.nan
plt.title('2D data w/ NaNs, 1x1 grid');plt.pcolormesh(lon0, lat0, data_nan);plt.show()
fl = RegularGridInterpolator((lat0, lon0), data_nan, method='linear', bounds_error=True)
interped_lin = fl((lat, lon))
plt.title('linear interpolation');plt.scatter(lon, lat,c=interped_lin);plt.show()
fc = RegularGridInterpolator((lat0, lon0), data_nan, method='cubic', bounds_error=True)
interped_cub = fc((lat, lon))
plt.title('cubic interpolation');plt.scatter(lon, lat,c=interped_cub);plt.show()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[3], line 27
25 plt.title('linear interpolation');plt.scatter(lon, lat,c=interped_lin);plt.show()
26 fc = RegularGridInterpolator((lat0, lon0), data_nan, method='cubic', bounds_error=True)
---> 27 interped_cub = fc((lat, lon))
28 plt.title('cubic interpolation');plt.scatter(lon, lat,c=interped_cub);plt.show()
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_rgi.py:335, in RegularGridInterpolator.__call__(self, xi, method)
333 if is_method_changed:
334 self._validate_grid_dimensions(self.grid, method)
--> 335 result = self._evaluate_spline(self.values.T, xi,
336 self._SPLINE_DEGREE_MAP[method])
338 if not self.bounds_error and self.fill_value is not None:
339 result[out_of_bounds] = self.fill_value
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_rgi.py:388, in RegularGridInterpolator._evaluate_spline(self, values, xi, spline_degree)
381 # Non-stationary procedure: difficult to vectorize this part entirely
382 # into numpy-level operations. Unfortunately this requires explicit
383 # looping over each point in xi.
384
385 # can at least vectorize the first pass across all points in the
386 # last variable of xi.
387 last_dim = n - 1
--> 388 first_values = self._do_spline_fit(self.grid[last_dim],
389 values,
390 xi[:, last_dim],
391 spline_degree)
393 # the rest of the dimensions have to be on a per point-in-xi basis
394 result = np.empty(m, dtype=self.values.dtype)
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_rgi.py:414, in RegularGridInterpolator._do_spline_fit(x, y, pt, k)
412 @staticmethod
413 def _do_spline_fit(x, y, pt, k):
--> 414 local_interp = make_interp_spline(x, y, k=k, axis=0)
415 values = local_interp(pt)
416 return values
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_bsplines.py:1240, in make_interp_spline(x, y, k, t, bc_type, axis, check_finite)
1237 axis = normalize_axis_index(axis, y.ndim)
1239 x = _as_float_array(x, check_finite)
-> 1240 y = _as_float_array(y, check_finite)
1242 y = np.moveaxis(y, axis, 0) # now internally interp axis is zero
1244 # sanity check the input
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_bsplines.py:36, in _as_float_array(x, check_finite)
34 x = x.astype(dtyp, copy=False)
35 if check_finite and not np.isfinite(x).all():
---> 36 raise ValueError("Array must not contain infs or nans.")
37 return x
ValueError: Array must not contain infs or nans.
谢谢!如果我可以改进我的示例,请告诉我。
Scipy 的
RegularGridInterpolator
希望用户为规则网格中的每个点提供数据,因此省略 NaN
点将导致错误。要对 NaN
进行插值,您可以使用非结构化插值器,例如 scipy.interpolate.griddata
,它允许用户仅传递某些点(即您可以删除 NaN
)。请参阅下面的一些示例代码。
import numpy as np
from scipy.interpolate import griddata
nx = 100
ny = 100
x = np.linspace(0, 10, nx)
y = np.linspace(0, 10, ny)
X, Y = np.meshgrid(x, y)
rng = np.random.default_rng(42)
n_nan = 1000
non_nan_indices = rng.choice(np.arange(nx*ny), nx*ny - n_nan)
X_nan = X.flatten()[non_nan_indices]
Y_nan = Y.flatten()[non_nan_indices]
Z_nan = np.cos(X_nan)*np.sin(Y_nan)
Z_interp = griddata((X_nan, Y_nan), Z_nan, (X.flatten(), Y.flatten()), method="cubic")
Z_interp
可以重塑为 (nx, ny)
。