如何在使用dask打开的xarray数据集中使用一维函数metpy.parcel_profile

问题描述 投票:0回答:0

我无法在 4D 的 xarray 数据集(用 dask 分段)上计算 Parcel_profile(1D 函数)。

你好,

我真的需要帮助,我正在研究有关压力水平的 ERA5 每小时数据。我提取了几个大气压水平下的相对湿度和温度。我使用 Metpy 的“dewpoint_from_relative_humidity”函数来计算露点。该表相当大,因此我使用 dask 创建多个块。我有下表:

import numpy as np
import metpy as mp
import metpy.calc as mpcalc
import xarray as xr
import metpy.units as mpunit

with dask.config.set(**{'array.slicing.split_large_chunks': True}):
    ncin_1 = xr.open_dataset(ncfname_1, engine='netcdf4').chunk('auto')
    ncin_1['t'] = ncin_1['t'] - 273.15
    ncin_1['r'] = ncin_1['r'] / 100.0
    ncin_1['r'] = ncin_1['r'].clip(min=0.01, max=1)

# variable reprocessing with metpy unit registry
ncin_1['level'].attrs['units'] = 'hPa'
ncin_1 = ncin_1.metpy.quantify()
ncin_1['time']  = ncin_1['time'].metpy.time 
ncin_1['latitude']  = ncin_1['latitude'].metpy.latitude
ncin_1['longitude']  = ncin_1['longitude'].metpy.longitude
ncin_1['level']  = ncin_1['level'].metpy.vertical
ncin_1['t'] = ncin_1['t'] * mpunit.units.degC
ncin_1['dewpoint'] = ncin_1['dewpoint'] * mpunit.units.degC
ncin_1['r'] = ncin_1['r'] * mpunit.units.dimensionless

# pressures levels in descending order
ncin_1= ncin_1.where((ncin_1['level'] >= 100) & (test['level'] <= 1000), drop=True)
ncin_1=  ncin_1.sortby('level', ascending=False)

# create a profile variable to be filled in
ncin_1['profil'] = (('time', 'level', 'latitude', 'longitude'),np.full_like(ncin_1['t'], fill_value=np.nan))
<xarray.Dataset>
Dimensions:    (longitude: 60, latitude: 41, level: 21, time: 24836)
Coordinates:
  * longitude  (longitude) float32 -5.02 -4.77 -4.52 -4.27 ... 9.23 9.48 9.73
  * latitude   (latitude) float32 51.15 50.9 50.65 50.4 ... 41.65 41.4 41.15
  * level      (level) int32 20 50 100 150 200 250 ... 750 800 850 900 950 1000
  * time       (time) datetime64[ns] 2006-01-01 ... 2022-12-31T18:00:00
Data variables:
    r          (time, level, latitude, longitude) float32 0.03323 ... 0.7702
    t          (time, level, latitude, longitude) float32 -69.07 ... 14.25
    dewpoint   (time, level, latitude, longitude) float32 -90.23 ... 10.28
    profil     (time, level, latitude, longitude) float32 nan nan ... nan nan
Attributes:
    Conventions:  CF-1.6
    history:      2023-07-19 22:32:41 GMT by grib_to_netcdf-2.25.1: /opt/ecmw... 

我想计算提升索引(使用metpy的lifted_index函数),但首先我需要计算parcel_profile变量。根据文档,问题在于该函数是一维函数。我使用 xarray.apply_ufunc 或 xarray.map_blocks 制作了几个脚本。

  1. 惠特 xarray.apply_ufunc :
def wrapper_parcel_profile(pressure, temperature, dewpoint):
    return mpcalc.parcel_profile(pressure * units.hPa , temperature * units.degC , dewpoint * units.degC ).to('degC')


t_1000 = ncin_1['t'].metpy.sel(level=1000)
dewpoint_1000 = ncin_1['dewpoint'].metpy.sel(level=1000) 
pressure = ncin_1['level'] 

ncin_1['profil'] = xr.apply_ufunc(
    wrapper_parcel_profile,  
    pressure, t_1000, dewpoint_1000,  
    input_core_dims=[['level'], ['time', 'latitude', 'longitude'], ['time', 'latitude', 'longitude']], 
    output_core_dims=[['time', 'level' , 'latitude', 'longitude']],
    vectorize=True,
    dask='parallelized', 
    output_dtypes=[float],
    dask_gufunc_kwargs={'allow_rechunk': True}
)

这个脚本运行是因为我使用的是dask,所以如果我理解正确的话,只要我不运行ncin_1.compute()命令,就不会直接计算任何东西。我收到此错误消息:

ValueError: operands could not be broadcast together with shapes (19,) (24836,41,60) 
(一定和ncin_1['level']有关系?)

  1. 与 xarray.map_block :
def wrapper_parcel_profile(pressure, temperature, dewpoint):
    return mpcalc.parcel_profile( pressure * units.hPa , temperature * units.degC , dewpoint * units.degC).to('degC')

pressure = test['level']
t_1000 = test['t'].metpy.sel(level=1000) * units.degC
dewpoint_1000 = test['dewpoint'].metpy.sel(level=1000) * units.degC
test['profil'] = xr.map_blocks(wrapper_parcel_profile, test ,template= test['t'])

当我在数据库中使用“ncin_1.compute()”时收到此错误消息:

TypeError: Mixing chunked array types is not supported, but received multiple types: {, }

我的方法正确吗?是否可以简单地通过保留在 xarray 数据集中来做到这一点?我找到的解决方案合适吗?预先感谢您的帮助

python dask python-xarray metpy era5
© www.soinside.com 2019 - 2024. All rights reserved.