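Please note that I am new to Iceberg and am working on a POC around it. I created an Iceberg table in AWS Athena and am trying to connect to it via pyiceberg. I can connect to the catalog successfully and list the namespaces and tables correctly. However, when I call load_table or a create-table operation, I get a 403 error. I configured environment variables with the connection details, and the equivalent HEAD operation on the metadata file via boto3.client('s3') works fine.
Am I missing some configuration in the load_catalog block? Has anyone else resolved this error, or does anyone have insight into a fix?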
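from pyiceberg.catalog import load_catalog

# Load the AWS Glue catalog; credentials come from the environment variables.
catalog = load_catalog('glue', **{
    'type': 'glue'
})

# List every namespace (database) and the tables inside it.
databases = catalog.list_namespaces()
for d in databases:
    print(d[0])
    tables = catalog.list_tables(d)

# This is the call that fails with the 403 shown below.
table = catalog.load_table('default.iceberg_test')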
Error stack trace:
FutureWarning: The S3RegionRedirector class has been deprecated for a new internal replacement. A future version of botocore may remove this class.
warnings.warn(
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/s3fs/core.py", line 112, in _error_wrapper
return await func(*args, **kwargs)
File "/opt/homebrew/lib/python3.10/site-packages/aiobotocore/client.py", line 358, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/dataenginerd/python/utilities/pyiceberg_test.py", line 36, in <module>
table = catalog.load_table('default.iceberg_test')
File "/opt/homebrew/lib/python3.10/site-packages/pyiceberg/catalog/glue.py", line 278, in load_table
return self._convert_glue_to_iceberg(load_table_response.get(PROP_GLUE_TABLE, {}))
File "/opt/homebrew/lib/python3.10/site-packages/pyiceberg/catalog/glue.py", line 180, in _convert_glue_to_iceberg
metadata = FromInputFile.table_metadata(file)
File "/opt/homebrew/lib/python3.10/site-packages/pyiceberg/serializers.py", line 56, in table_metadata
with input_file.open() as input_stream:
File "/opt/homebrew/lib/python3.10/site-packages/pyiceberg/io/fsspec.py", line 166, in open
return self._fs.open(self.location, "rb")
File "/opt/homebrew/lib/python3.10/site-packages/fsspec/spec.py", line 1135, in open
f = self._open(
File "/opt/homebrew/lib/python3.10/site-packages/s3fs/core.py", line 640, in _open
return S3File(
File "/opt/homebrew/lib/python3.10/site-packages/s3fs/core.py", line 1989, in __init__
super().__init__(
File "/opt/homebrew/lib/python3.10/site-packages/fsspec/spec.py", line 1491, in __init__
self.size = self.details["size"]
File "/opt/homebrew/lib/python3.10/site-packages/fsspec/spec.py", line 1504, in details
self._details = self.fs.info(self.path)
File "/opt/homebrew/lib/python3.10/site-packages/fsspec/asyn.py", line 114, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/opt/homebrew/lib/python3.10/site-packages/fsspec/asyn.py", line 99, in sync
raise return_result
File "/opt/homebrew/lib/python3.10/site-packages/fsspec/asyn.py", line 54, in _runner
result[0] = await coro
File "/opt/homebrew/lib/python3.10/site-packages/s3fs/core.py", line 1210, in _info
out = await self._call_s3(
File "/opt/homebrew/lib/python3.10/site-packages/s3fs/core.py", line 339, in _call_s3
return await _error_wrapper(
File "/opt/homebrew/lib/python3.10/site-packages/s3fs/core.py", line 139, in _error_wrapper
raise err
PermissionError: Forbidden
I have verified with the boto3 stack that the same operation works as expected.
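For anyone who wants to reproduce that boto3 check, it might look roughly like the sketch below. The helper names `split_s3_uri` and `head_metadata_file` are hypothetical and not part of boto3 or pyiceberg; the only boto3 call involved is `head_object`, made on whatever client is passed in.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("s3", "s3a", "s3n") or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def head_metadata_file(client, uri):
    """Issue a HEAD request for the object and return its size in bytes.

    `client` is expected to behave like boto3.client('s3').
    """
    bucket, key = split_s3_uri(uri)
    response = client.head_object(Bucket=bucket, Key=key)
    return response["ContentLength"]
```

With real credentials, this would be called as `head_metadata_file(boto3.client('s3'), uri)`, where `uri` is the metadata location that Glue reports for the table. If this succeeds while load_table still returns 403, the difference probably lies in how the file I/O layer used by pyiceberg picks up credentials or region, not in the bucket permissions themselves.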
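The root problem was that the s3fs dependency had not been installed. I ran
pip install "pyiceberg[glue,s3fs,pyarrow]"
and that resolved the error I had been seeing. Hopefully this helps others who run into the same predicament in the future.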
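A quick way to confirm that the install actually took effect in the interpreter you are running is to check that the expected modules can be found. This is only a sketch: the function name `missing_modules` is hypothetical, and the list of module names is my assumption about what the extras in the pip command provide.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed mapping from the pip extras to top-level module names.
REQUIRED = ["pyiceberg", "boto3", "s3fs", "pyarrow"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All expected modules are importable.")
```

Run it with the same Python that runs your pyiceberg script (in the trace above, the one under /opt/homebrew), since packages installed into a different environment will not be visible.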