PySpark in Azure HDInsight cannot access secret values in Azure Key Vault


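I am currently implementing a PySpark job in Azure HDInsight, and our secrets are stored in Azure Key Vault. In theory we can access them with azure-sdk-for-python. We have also set up a user-assigned managed identity and the related role assignments. However, the same code runs locally (on a Mac) but fails on the Azure HDInsight cluster. It always shows the following error message: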

DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
    EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
Visit https://aka.ms/azsdk/python/identity/environmentcredential/troubleshoot to troubleshoot.this issue.
    ManagedIdentityCredential: ManagedIdentityCredential authentication unavailable. The requested identity has not been assigned to this resource.
    SharedTokenCacheCredential: Shared token cache unavailable
    AzureCliCredential: Please run 'az login' to set up an account
    AzurePowerShellCredential: PowerShell is not installed
To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/python/identity/defaultazurecredential/troubleshoot.
Traceback (most recent call last):
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/keyvault/secrets/_client.py", line 72, in get_secret
    bundle = self._client.get_secret(
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/keyvault/secrets/_generated/_operations_mixin.py", line 1574, in get_secret
    return mixin_instance.get_secret(vault_base_url, secret_name, secret_version, **kwargs)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/keyvault/secrets/_generated/v7_3/operations/_key_vault_client_operations.py", line 694, in get_secret
    pipeline_response = self._client._pipeline.run(  # pylint: disable=protected-access
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 205, in run
    return first_node.send(pipeline_request)  # type: ignore
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
    response = self.next.send(request)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
    response = self.next.send(request)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/_base.py", line 69, in send
    response = self.next.send(request)
  [Previous line repeated 2 more times]
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/policies/_redirect.py", line 160, in send
    response = self.next.send(request)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/policies/_retry.py", line 474, in send
    response = self.next.send(request)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/core/pipeline/policies/_authentication.py", line 115, in send
    self.on_request(request)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/keyvault/secrets/_shared/challenge_auth_policy.py", line 78, in on_request
    self._token = self._credential.get_token(scope, tenant_id=challenge.tenant_id)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/identity/_credentials/default.py", line 168, in get_token
    return super(DefaultAzureCredential, self).get_token(*scopes, **kwargs)
  File "/usr/bin/miniforge/envs/py38/lib/python3.8/site-packages/azure/identity/_credentials/chained.py", line 101, in get_token
    raise ClientAuthenticationError(message=message)
azure.core.exceptions.ClientAuthenticationError: DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
    EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
Visit https://aka.ms/azsdk/python/identity/environmentcredential/troubleshoot to troubleshoot.this issue.
    ManagedIdentityCredential: ManagedIdentityCredential authentication unavailable. The requested identity has not been assigned to this resource.
    SharedTokenCacheCredential: Shared token cache unavailable
    AzureCliCredential: Please run 'az login' to set up an account
    AzurePowerShellCredential: PowerShell is not installed
To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/python/identity/defaultazurecredential/troubleshoot.

I do specify the relevant managed identity in `DefaultAzureCredential()`, but it never works on the Azure HDInsight cluster. I would appreciate any ideas or alternative solutions.
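For reference, the setup described above looks roughly like the sketch below. The original code was not posted, so this is a reconstruction under assumptions: the helper names `vault_url` and `read_secret`, and the vault name, secret name, and client ID passed to them, are hypothetical. It uses the `managed_identity_client_id` keyword of `DefaultAzureCredential` to select a user-assigned identity; it illustrates the call pattern only and is not claimed to resolve the error on HDInsight.

```python
def vault_url(vault_name: str) -> str:
    """Build the Key Vault endpoint URL (public Azure cloud) from a vault name."""
    return f"https://{vault_name}.vault.azure.net"


def read_secret(vault_name: str, secret_name: str, client_id: str) -> str:
    """Read one secret value using a user-assigned managed identity.

    client_id is the client ID of the user-assigned identity (hypothetical
    placeholder here; the real value comes from the identity resource).
    """
    # Imported inside the function so this module can be loaded on machines
    # where the Azure SDK packages are not installed.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    # Tell the managed-identity step of the credential chain which
    # user-assigned identity to request a token for.
    credential = DefaultAzureCredential(managed_identity_client_id=client_id)
    client = SecretClient(vault_url=vault_url(vault_name), credential=credential)
    return client.get_secret(secret_name).value
```

A call would then look like `read_secret("my-vault", "my-secret", "<client-id>")`, with all three arguments being placeholders. Note that the `ManagedIdentityCredential` line in the error above reports that the requested identity is not assigned to the resource the code runs on, which points at the environment rather than at this call pattern.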

I tried the same code on my local machine and it works. It just does not work on the Azure HDInsight cluster.

pyspark azure-keyvault azure-hdinsight azure-managed-identity