在 Python 中从网络加载 GRIB,而不在本地保存文件

问题描述 投票:0回答:2

我想从网上下载 GRIB 文件:

选项1:https://noaa-gfs-bdp-pds.s3.amazonaws.com/gfs.20210801/12/atmos/gfs.t12z.pgrb2.1p00.f000

选项2:https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0- Degree/analysis/202108/20210801/gfs_3_20210801_1200_000.grb2 (仅供参考,数据相同)

这里提出了类似的问题: 如何在不先下载文件的情况下使用 pygrib 打开 GRIB 文件? 然而,最终并没有给出最终的解决方案,因为数据源有多个响应(导致问题)。当我尝试重新创建代码时:

url = 'https://noaa-gfs-bdp-pds.s3.amazonaws.com/gfs.20210801/12/atmos/gfs.t12z.pgrb2.1p00.f000'
urllib.request.install_opener(opener)

req = urllib.request.Request(url,headers={'User-Agent': 'Mozilla/5.0'})
data = urllib.request.urlopen(req, timeout = 300)

import pygrib 
grib = pygrib.fromstring(data.read())

我的错误是不同的:

ECCODES ERROR   :  grib_handle_new_from_message: No final 7777 in message!
Traceback (most recent call last):
  File "grib_data_test.py", line 139, in <module>
    grib = pygrib.fromstring(data.read())
  File "src\pygrib\_pygrib.pyx", line 627, in pygrib._pygrib.fromstring
  File "src\pygrib\_pygrib.pyx", line 1390, in pygrib._pygrib.gribmessage._set_projparams
  File "src\pygrib\_pygrib.pyx", line 1127, in pygrib._pygrib.gribmessage.__getitem__
RuntimeError: b'Key/value not found'

我还尝试直接使用 osgeo.gdal 库(最适合我的项目,因为它已经在项目中使用)。文档:https://gdal.org/user/virtual_file_systems.html#vsimem-in-memory-files

Attempt1: url = "/vsicurl/https://noaa-gfs-bdp-pds.s3.amazonaws.com/gfs.20210801/12/atmos/gfs.t12z.pgrb2.0p50.f000"
Attempt2: url = "/vsis3/https://noaa-gfs-bdp-pds.s3.amazonaws.com/gfs.20210801/12/atmos/gfs.t12z.pgrb2.0p50.f000"
Attempt3: url = "/vsicurl/https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2"

grib = gdal.Open(url)

错误:

Attempt1: ERROR 11: CURL error: schannel: next InitializeSecurityContext failed: Unknown error (0x80092012) - The revocation function was unable to check revocation for the certificate.
Attempt2: ERROR 11: HTTP response code: 0
Attempt3: ERROR 4: /vsicurl/https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2 is a grib file, but no raster dataset was successfully identified.

对于尝试 1 和 2:我觉得这两种尝试都应该在这里工作 - 通常您不需要任何特定的 AWS 凭证/连接来访问公开可用的 s3 存储桶数据(因为它是使用上面的 urllib 访问的)。

对于Attempt3,这里也有类似的问题: https://gis.stackexchange.com/questions/395867/opening-a-grib-from-the-web-with-gdal-in-python-using-vsicurl-throws-error-on-m 但是我没有遇到同样的问题,我的输出:

gdalinfo /vsicurl/https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2 --debug on --config CPL_CURL_VERBOSE YES

HTTP: libcurl/7.78.0 Schannel zlib/1.2.11 libssh2/1.9.0
HTTP: GDAL was built against curl 7.77.0, but is running against 7.78.0.
* Couldn't find host www.ncei.noaa.gov in the (nil) file; using defaults
*   Trying 205.167.25.177:443...
* Connected to www.ncei.noaa.gov (205.167.25.177) port 443 (#0)
* schannel: disabled automatic use of client certificate
> HEAD /data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2 HTTP/1.1
Host: www.ncei.noaa.gov
Accept: */*

* schannel: server closed the connection
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Mon, 20 Sep 2021 14:38:59 GMT
< Server: Apache
< Strict-Transport-Security: max-age=31536000
< Last-Modified: Sun, 01 Aug 2021 04:46:49 GMT
< ETag: "287c05a-5c87824012f22"
< Accept-Ranges: bytes
< Content-Length: 42451034
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Headers: X-Requested-With, Content-Type
< Connection: close
<
* Closing connection 0
* schannel: shutting down SSL/TLS connection with www.ncei.noaa.gov port 443
VSICURL: GetFileSize(https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2)=42451034  response_code=200
VSICURL: Downloading 0-16383 (https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2)...
* Couldn't find host www.ncei.noaa.gov in the (nil) file; using defaults
* Hostname www.ncei.noaa.gov was found in DNS cache
*   Trying 205.167.25.177:443...
* Connected to www.ncei.noaa.gov (205.167.25.177) port 443 (#1)
* schannel: disabled automatic use of client certificate
> GET /data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2 HTTP/1.1
Host: www.ncei.noaa.gov
Accept: */*
Range: bytes=0-16383

* schannel: failed to decrypt data, need more data
* Mark bundle as not supporting multiuse
< HTTP/1.1 206 Partial Content
< Date: Mon, 20 Sep 2021 14:39:00 GMT
< Server: Apache
< Strict-Transport-Security: max-age=31536000
< Last-Modified: Sun, 01 Aug 2021 04:46:49 GMT
< ETag: "287c05a-5c87824012f22"
< Accept-Ranges: bytes
< Content-Length: 16384
< Content-Range: bytes 0-16383/42451034
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Headers: X-Requested-With, Content-Type
< Connection: close
<
* schannel: failed to decrypt data, need more data
* schannel: failed to decrypt data, need more data
* schannel: failed to decrypt data, need more data
* Closing connection 1
* schannel: shutting down SSL/TLS connection with www.ncei.noaa.gov port 443
VSICURL: Got response_code=206
VSICURL: Downloading 81920-98303 (https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2)...
* Couldn't find host www.ncei.noaa.gov in the (nil) file; using defaults
* Hostname www.ncei.noaa.gov was found in DNS cache
*   Trying 205.167.25.177:443...
* Connected to www.ncei.noaa.gov (205.167.25.177) port 443 (#2)
* schannel: disabled automatic use of client certificate
* schannel: failed to receive handshake, SSL/TLS connection failed
* Closing connection 2
VSICURL: DownloadRegion(https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2): response_code=0, msg=schannel: failed to receive handshake, SSL/TLS connection failed
VSICURL: Got response_code=0
GRIB: ERROR: Ran out of file in Section 7
ERROR: Problems Jumping past section 7

ERROR 4: /vsicurl/https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2 is a grib file, but no raster dataset was successfully identified.
gdalinfo failed - unable to open '/vsicurl/https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-003-1.0-degree/analysis/202108/20210801/gfs_3_20210801_0000_000.grb2'.

基本上。我目前可以使用 urllib 下载文件,然后:

file_out = open(<file.path>, 'w')
file_out.write(data.read())
grib = gdal.Open(file_out)

满足我的要求。但由于我们正在使用的系统,我们希望在临时处理期间不要将文件保存在本地。

Python 版本:

gdal                      3.3.1            py38hacca965_1    defaults
pygrib                    2.1.4            py38hceae430_0    defaults
python                    3.8.10          h7840368_1_cpython    defaults

干杯。

python-3.x gdal grib pygrib in-memory-data-grid
2个回答
1
投票

这对我有用。不过,我使用的是 AWS 上托管的 GEFS 数据。我相信 AWS 上也有 GFS。不需要帐户,因此只需更改

bucket
s3_object
名称以指向 GFS 数据

import pygrib
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
bucket_name = 'noaa-gefs-pds'
s3_object = 'gefs.20220425/00/atmos/pgrb2ap5/gec00.t00z.pgrb2a.0p50.f000'

obj = s3.get_object(Bucket=bucket_name, Key=s3_object)['Body'].read()
grbs = pygrib.fromstring(obj)
print(type(grbs))

0
投票

哎呀!我开始对 NOAA 的模型进行详细分析。 Nunca trabalhei com esse tipo de análise, estou tendo dificuldades para começar. Conseguiu 解决者有问题吗? caso sim、pode compartilhar 或 codigo comigo 或 sugerir 教程? 奥布里加达!

© www.soinside.com 2019 - 2024. All rights reserved.